Posted to user@zookeeper.apache.org by Ivan Kelly <iv...@apache.org> on 2021/08/05 14:37:22 UTC

Re: [EXTERNAL EMAIL] - Re: zookeeper ensemble on two AWS AZ

> Promoting the 2 observers to participants will be a manual step (as part of disaster recovery) to get the cluster up. During this manual step, if needed, we can shut down/terminate the old AZ instances.
> We also have Puppet managing configuration. The Puppet module will be updated to reflect the new cluster instances. So if the old AZ comes back up, Puppet will see that these instances are no longer part of the ZooKeeper cluster and the module will stop the ZooKeeper service.
What guarantee do you have that all clients will have switched over to
the new cluster? Even if Puppet shuts down the old cluster, it will
take time for it to see that the cluster needs to be shut down, which
leaves a window in which clients can still connect to the old cluster
and make changes.

> A side question: Will observers always be in sync with the entire cluster? In other words, when will observers be in sync with the quorum participants?
By in sync, I take it that you mean that any write that was
acknowledged by the initial cluster exists in the failover cluster.
No, they may not be in sync. The observer will always have a prefix of
the log of the participants. This prefix may be the entire log, or it
may be missing the latest writes. This is true even if you have a
participant in the failover AZ. For a write to be acknowledged, it has
to reach a majority of the participants.

With 2 AZs, 1 AZ will always have a majority, so if it goes down,
writes will be missing from the other AZ. The exception to this is
where there's an even number of participants in each AZ. In this case,
if one AZ goes down, you can no longer form a majority, but all
writes will exist on both AZs. Maybe this could be a path forward,
since you accept that you will have manual failover. I'm not sure how
well this scenario is supported in the tooling though.

-Ivan
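
To make the majority argument above concrete, here is a minimal Python sketch. It is not ZooKeeper code, and the function names are made up for illustration. It assumes a write is acknowledged once a simple majority of participants has it, and that observers do not count toward that majority. It enumerates every minimal majority and checks whether each one includes a participant from every AZ.

```python
from itertools import combinations


def majority(n):
    """Smallest number of participants that forms a majority of n."""
    return n // 2 + 1


def acked_write_reaches_every_az(participants_per_az):
    """True if every possible majority includes a participant from each AZ."""
    # Label each participant with the index of the AZ it lives in.
    nodes = [az for az, count in enumerate(participants_per_az)
             for _ in range(count)]
    need = majority(len(nodes))
    all_azs = set(range(len(participants_per_az)))
    # Checking minimal majorities is enough: larger ones contain them.
    return all(set(acks) == all_azs for acks in combinations(nodes, need))


def can_form_quorum_without(participants_per_az, lost_az):
    """True if the surviving AZs still hold a majority of all participants."""
    total = sum(participants_per_az)
    remaining = total - participants_per_az[lost_az]
    return remaining >= majority(total)


if __name__ == "__main__":
    for layout in ([3, 2], [2, 2]):
        print(layout,
              "write on both AZs:", acked_write_reaches_every_az(layout),
              "quorum after losing AZ 0:", can_form_quorum_without(layout, 0))
```

With [3, 2] the check fails because the three participants in the larger AZ can acknowledge a write on their own. With [2, 2] every acknowledged write is on at least one server in each AZ, but losing either AZ leaves 2 of 4 participants, which is not a majority.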

Re: [EXTERNAL EMAIL] - Re: zookeeper ensemble on two AWS AZ

Posted by Shailesh Ligade <SL...@FBI.GOV.INVALID>.

Thanks,

That would be a lot easier, but in the air-gapped environment we are in, we have only 2 AZs 🙁
That's why I was thinking of using observers and, in case of failure, using dynamic configuration updates to promote the observers to participants. I think this will work, provided the znode data is in sync.

-S
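
As a rough illustration of the promotion step described above, the Python sketch below rewrites server entries so that selected observers become participants. It assumes the ZooKeeper 3.5+ dynamic configuration line format (server.<id>=<host>:<quorum port>:<election port>[:role];<client port>) and hostnames without colons; the hostnames, ids, and the helper name are hypothetical. It only produces the new lines, it does not apply them. As far as I know, a dynamic reconfig has to be committed by a quorum of the current participants, so with the majority AZ down this would likely have to be a manual config edit and restart rather than a reconfig call.

```python
def promote_observers(server_lines, ids_to_promote):
    """Return server lines with the given observer ids switched to participant."""
    result = []
    for line in server_lines:
        key, spec = line.split("=", 1)
        server_id = int(key.strip().split(".", 1)[1])
        # Split off the client port part that follows the ';'.
        address, sep, client = spec.strip().partition(";")
        fields = address.split(":")  # host, quorum port, election port[, role]
        if server_id in ids_to_promote:
            if len(fields) == 4 and fields[3] == "observer":
                fields[3] = "participant"
        result.append(f"server.{server_id}={':'.join(fields)}{sep}{client}")
    return result


if __name__ == "__main__":
    # Hypothetical layout: participants in AZ a, observers in AZ b.
    current = [
        "server.1=zk1.az-a.example:2888:3888:participant;2181",
        "server.4=zk4.az-b.example:2888:3888:observer;2181",
        "server.5=zk5.az-b.example:2888:3888:observer;2181",
    ]
    for entry in promote_observers(current, {4, 5}):
        print(entry)
```

Entries whose id is not listed, or that are not observers, are passed through unchanged.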

________________________________
From: Zhewei Hu <zh...@gmail.com>
Sent: Friday, August 6, 2021 2:38 AM
To: user@zookeeper.apache.org <us...@zookeeper.apache.org>
Subject: Re: [EXTERNAL EMAIL] - Re: zookeeper ensemble on two AWS AZ

Well, why not set up the ZK ensemble across 3 AZs? In this case, for a 5-node
ensemble, we can have 2, 2, 1 ZK servers per AZ. Then if any one AZ is down,
the ensemble is still up. This also works for a 7-node ensemble, where we can
have 3, 2, 2 ZK servers per AZ.

HTH,
Zhewei

Re: [EXTERNAL EMAIL] - Re: zookeeper ensemble on two AWS AZ

Posted by Zhewei Hu <zh...@gmail.com>.

Well, why not set up the ZK ensemble across 3 AZs? In this case, for a 5-node
ensemble, we can have 2, 2, 1 ZK servers per AZ. Then if any one AZ is down,
the ensemble is still up. This also works for a 7-node ensemble, where we can
have 3, 2, 2 ZK servers per AZ.

HTH,
Zhewei
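
The arithmetic behind the 2, 2, 1 and 3, 2, 2 layouts can be checked with a short Python sketch. It assumes every server listed is a voting participant and that a quorum is a simple majority; the function name is made up for illustration.

```python
def survives_any_single_az_loss(servers_per_az):
    """True if a majority of servers remains after losing any one AZ."""
    total = sum(servers_per_az)
    quorum = total // 2 + 1
    # Losing an AZ removes all of its servers at once.
    return all(total - lost >= quorum for lost in servers_per_az)


if __name__ == "__main__":
    for layout in ([2, 2, 1], [3, 2, 2], [3, 2], [2, 2]):
        print(layout, survives_any_single_az_loss(layout))
```

Both 3-AZ layouts keep a quorum after any single AZ failure, while no 2-AZ layout does: one of the two AZs always holds at least half of the servers.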

On Thu, Aug 5, 2021 at 07:38 Ivan Kelly <iv...@apache.org> wrote:

> *typo, the exception is when there's an _equal_ number of participants

Re: [EXTERNAL EMAIL] - Re: zookeeper ensemble on two AWS AZ

Posted by Ivan Kelly <iv...@apache.org>.

*typo, the exception is when there's an _equal_ number of participants
