Posted to solr-user@lucene.apache.org by Matteo Grolla <ma...@gmail.com> on 2015/10/29 18:08:47 UTC

restore quorum after majority of zk nodes down

I'm designing a SolrCloud installation where nodes from a single cluster
are distributed across 2 datacenters which are close and very well connected.
Let's say that zk nodes zk1 and zk2 are in DC1 and zk3 is in DC2, and let's say
that DC1 goes down and the cluster is left with zk3.
How can I restore a zk quorum from this situation?

thanks

Re: restore quorum after majority of zk nodes down

Posted by Pushkar Raste <pu...@gmail.com>.
We need to bounce it, but the outage will be very short and you don't have to
take down the rest of the ZooKeeper instances.

On 30 October 2015 at 11:00, Daniel Collins <da...@gmail.com> wrote:

> Aren't you asking for dynamic ZK configuration which isn't supported yet
> (ZOOKEEPER-107, only in 3.5.0-alpha)?  How do you swap a zookeeper
> instance from being an observer to a voting member?

Re: restore quorum after majority of zk nodes down

Posted by Daniel Collins <da...@gmail.com>.
Aren't you asking for dynamic ZK reconfiguration, which isn't supported yet
(ZOOKEEPER-107, only in 3.5.0-alpha)?  How do you swap a ZooKeeper
instance from being an observer to a voting member?

On 30 October 2015 at 09:34, Matteo Grolla <ma...@gmail.com> wrote:

> Pushkar... I love this solution
>       thanks
> I'd just go with 3 zk nodes on each side

Re: restore quorum after majority of zk nodes down

Posted by Matteo Grolla <ma...@gmail.com>.
Pushkar... I love this solution
      thanks
I'd just go with 3 zk nodes on each side

2015-10-29 23:46 GMT+01:00 Pushkar Raste <pu...@gmail.com>:

> How about having, let's say, 4 nodes on each side and making one node in one
> of the data centers an observer? When the data center with the majority of
> the nodes goes down, bounce the observer by reconfiguring it as a voting member.
>
> Afterwards you will have to revert that node back to being an observer.
>
> There will be a short outage as far as indexing is concerned, but queries
> should continue to work and you don't have to take all the ZooKeeper nodes
> down.
>
> -- Pushkar Raste

Re: restore quorum after majority of zk nodes down

Posted by Pushkar Raste <pu...@gmail.com>.
How about having, let's say, 4 nodes on each side and making one node in one
of the data centers an observer? When the data center with the majority of
the nodes goes down, bounce the observer by reconfiguring it as a voting member.

Afterwards you will have to revert that node back to being an observer.

There will be a short outage as far as indexing is concerned, but queries
should continue to work and you don't have to take all the ZooKeeper nodes
down.
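
A rough sketch of what the static server list for such a layout could look
like, written as a small Python helper that renders the server.N lines of
zoo.cfg. The host names (dc1-zk1 ... dc2-zk4), the helper name server_lines
and the choice of which node is the observer are assumptions for illustration
only; 2888 and 3888 are just the conventional quorum and election ports.

```python
def server_lines(voters, observers, quorum_port=2888, election_port=3888):
    """Render the server.N entries that every node's zoo.cfg would share."""
    hosts = [(h, False) for h in voters] + [(h, True) for h in observers]
    lines = []
    for server_id, (host, is_observer) in enumerate(hosts, start=1):
        # Observers are marked with a trailing ":observer" in the server list.
        suffix = ":observer" if is_observer else ""
        lines.append(
            f"server.{server_id}={host}:{quorum_port}:{election_port}{suffix}"
        )
    return lines


# Hypothetical hosts: 4 nodes per data center, one DC2 node is an observer.
VOTERS = ["dc1-zk1", "dc1-zk2", "dc1-zk3", "dc1-zk4",
          "dc2-zk1", "dc2-zk2", "dc2-zk3"]
OBSERVERS = ["dc2-zk4"]

if __name__ == "__main__":
    print("\n".join(server_lines(VOTERS, OBSERVERS)))
    # The observer's own zoo.cfg additionally needs the line: peerType=observer
```

With a static configuration, turning the observer into a voting member would
presumably mean dropping peerType=observer and the :observer suffix from the
config files and restarting the affected nodes.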

-- Pushkar Raste
On Oct 29, 2015 4:33 PM, "Matteo Grolla" <ma...@gmail.com> wrote:

> Hi Walter,
>       it's not a problem to take down zk for a short time (1h) and
> reconfigure it. Meanwhile Solr would go into read-only mode.
> I'd like feedback on the fastest way to do this. Would it work to just
> reconfigure the cluster with 2 other, empty zk nodes? Would they correctly
> sync from the non-empty one? Should I first copy the data from zk3 to the two
> empty zk nodes?
> Matteo

Re: restore quorum after majority of zk nodes down

Posted by Matteo Grolla <ma...@gmail.com>.
Hi Walter,
      it's not a problem to take down zk for a short time (1h) and
reconfigure it. Meanwhile Solr would go into read-only mode.
I'd like feedback on the fastest way to do this. Would it work to just
reconfigure the cluster with 2 other, empty zk nodes? Would they correctly
sync from the non-empty one? Should I first copy the data from zk3 to the two
empty zk nodes?
Matteo


2015-10-29 18:34 GMT+01:00 Walter Underwood <wu...@wunderwood.org>:

> You can't. Zookeeper needs a majority. One node is not a majority of a
> three node ensemble.
>
> There is no way to split a Solr Cloud cluster across two datacenters and
> have high availability. You can do that with three datacenters.
>
> You can probably bring up a new Zookeeper ensemble and configure the Solr
> cluster to talk to it.
>
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/  (my blog)

Re: restore quorum after majority of zk nodes down

Posted by Walter Underwood <wu...@wunderwood.org>.
You can't. Zookeeper needs a majority. One node is not a majority of a three node ensemble.

There is no way to split a Solr Cloud cluster across two datacenters and have high availability. You can do that with three datacenters.

You can probably bring up a new Zookeeper ensemble and configure the Solr cluster to talk to it.
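
The majority rule can be sketched in a few lines of Python. This is only an
illustration of the arithmetic, not ZooKeeper code; the function names
quorum_size and has_quorum are made up for the example.

```python
def quorum_size(voting_members: int) -> int:
    """Smallest number of voters that forms a strict majority."""
    return voting_members // 2 + 1


def has_quorum(voting_members: int, alive_voters: int) -> bool:
    """True if the surviving voters are a majority of the configured ensemble."""
    return alive_voters >= quorum_size(voting_members)


if __name__ == "__main__":
    # Three-node ensemble from the question: zk1 and zk2 in DC1, zk3 in DC2.
    print(quorum_size(3))    # majority needed for 3 voters
    print(has_quorum(3, 1))  # DC1 lost, only zk3 left
    print(has_quorum(3, 2))  # DC2 lost, zk1 and zk2 left
```

With one node in each of three datacenters, losing any single datacenter
leaves 2 of 3 voters, which is why the three-datacenter layout keeps a quorum.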

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Oct 29, 2015, at 10:08 AM, Matteo Grolla <ma...@gmail.com> wrote:
> 
> I'm designing a solr cloud installation where nodes from a single cluster
> are distributed on 2 datacenters which are close and very well connected.
> let's say that zk nodes zk1, zk2 are on DC1 and zk3 is on DC2 and let's say
> that DC1 goes down and the cluster is left with zk3.
> how can I restore a zk quorum from this situation?
> 
> thanks