Posted to dev@zookeeper.apache.org by Ted Dunning <te...@gmail.com> on 2018/10/23 02:02:12 UTC

improving tolerance to network failures

I am starting work on a project to improve the tolerance of Zookeeper to
network failures and would like feedback on the idea.

The problem is that in environments where link bonding is forbidden (they
exist, trust me), Zookeeper is sensitive to the loss of a single switch or
a few network links. This applies to both clients and servers.

Upon examination of the problem, I think that this could be mitigated by
changing the logic that opens connections between servers so that it tries
several alternative addresses for each peer. This should be a small change.
I think that dynamic reconfiguration should be fine with this as well.
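
A minimal sketch of the try-several-addresses idea, written in Python for
brevity rather than in ZooKeeper's Java. The function name `connect_any`, the
injectable `connect` parameter, and the timeout default are hypothetical and
are not part of ZooKeeper; the sketch only illustrates falling back to the
next address when one network path is down.

```python
import socket


def connect_any(addresses, timeout=2.0, connect=socket.create_connection):
    """Return (address, connection) for the first address that accepts.

    `addresses` is a list of (host, port) pairs that all reach the same
    peer over different network paths. `connect` is injectable so the
    logic can be exercised without a real network.
    """
    errors = []
    for address in addresses:
        try:
            return address, connect(address, timeout)
        except OSError as exc:
            # This path is unusable (switch or link down); try the next one.
            errors.append((address, exc))
    raise ConnectionError(f"all addresses failed: {errors}")
```

Trying the addresses in a fixed order keeps the change small; whether the
order should be randomized or remembered between attempts is a design
question the sketch leaves open.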

On the client side, the situation is simpler: we can provide, either
by configuration or from the server cluster, a list of all possible
addresses, and the client's current connection logic should work fine.
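
As a rough illustration of the configuration route, the sketch below flattens
a per-server address map into the comma-separated host:port list that
ZooKeeper clients accept as a connect string. The helper name
`connect_string`, the server names, and the IP addresses are made up for the
example.

```python
def connect_string(server_addresses, port=2181):
    """Flatten {server: [hosts]} into a comma-separated host:port list."""
    hosts = []
    for server in sorted(server_addresses):
        for host in server_addresses[server]:
            hosts.append(f"{host}:{port}")
    return ",".join(hosts)


# Two servers, each reachable over two independent networks.
example = connect_string({
    "zk1": ["10.0.1.1", "10.0.2.1"],
    "zk2": ["10.0.1.2", "10.0.2.2"],
})
# example == "10.0.1.1:2181,10.0.2.1:2181,10.0.1.2:2181,10.0.2.2:2181"
```

Note that the flat list no longer records which addresses belong to the same
server.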

One worry I have has to do with certificates on secure connections, but it
seems that multiple certs would do the trick.

I have started a collaborative document to work on the design approach.
Once that is judged by the community to be sufficiently mature, I will move
it to a JIRA.

That document is at
https://docs.google.com/document/d/1iGVwxeHp57qogwfdodCh9b32P2_kOQaJZ2GDo7j36fI/edit?usp=sharing

The design document is currently open to the world for commenting so that
anybody can suggest changes or ask questions. I will act as a bit of a
moderator so that the document can remain completely open.

Re: improving tolerance to network failures

Posted by Ted Dunning <te...@gmail.com>.
Michael,

I wouldn't characterize the current proposal as broken so much as say that
it talks about connection balancing rather than server balancing. Other
than that, I think I agree with what you are saying.

So we have two folks with a feeling that server balancing from the client
side is significantly better than connection balancing. I had thought that
this would be desirable to defer in the interest of code simplicity. That
may not be the right balance.

The point about hardware upgrades is a very good one.




On Tue, Oct 23, 2018 at 10:21 AM Michael Han <ha...@apache.org> wrote:

> >> Will there be a code effect?
>
> There will be. The current rebalancing algorithm will break unless
> StaticHostProvider.updateServerList is changed to make it aware that
> multiple server addresses can belong to the same server. For example,
> currently if we add a new server through reconfig, the rebalance will kick
> in. Under the new proposal, if we add a new address to an existing server
> and no code change is made to updateServerList, the rebalance will also
> kick in, but it should not, since in this case no new real servers are
> added.
>
> >> My own experience is that production settings typically involve
> >> Zookeeper servers with very consistent hardware where this would not be
> >> an issue.
>
> I think this is generally true, but we should consider cases where a user
> is upgrading hardware, which might take a while. During this time it would
> be ideal if ZK offered the capability of balancing client connections
> across an ensemble with heterogeneous hardware. As a user myself, I'd like
> to have this feature, especially considering that it does not seem hard to
> implement. What Alex proposed should work. Another approach might be to
> assign weights to each address (a single server has weight one), which
> reduces this to a weighted random selection problem.
>
> Overall, I think this proposal has little impact on the server side; most
> of the impact is on the client side.

Re: improving tolerance to network failures

Posted by Michael Han <ha...@apache.org>.
>> Will there be a code effect?

There will be. The current rebalancing algorithm will break unless
StaticHostProvider.updateServerList is changed to make it aware that
multiple server addresses can belong to the same server. For example,
currently if we add a new server through reconfig, the rebalance will kick
in. Under the new proposal, if we add a new address to an existing server
and no code change is made to updateServerList, the rebalance will also
kick in, but it should not, since in this case no new real servers are
added.
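
To make the distinction concrete, here is a small sketch in Python. It is not
the actual StaticHostProvider logic; the function names and the
server-id-to-addresses maps are assumptions made for illustration. It
contrasts a check that looks only at the flat address list with one that looks
at the set of real servers.

```python
def naive_should_rebalance(old_servers, new_servers):
    """Address-only check: any change in the flat address set triggers."""
    old_addresses = set().union(*old_servers.values())
    new_addresses = set().union(*new_servers.values())
    return old_addresses != new_addresses


def should_rebalance(old_servers, new_servers):
    """Server-aware check: only a change in the set of servers triggers.

    Both arguments map a server id to the set of addresses it listens on.
    Extra addresses on an existing server leave the number of real
    servers unchanged, so they do not trigger a rebalance.
    """
    return set(old_servers) != set(new_servers)


before = {1: {"10.0.1.1"}, 2: {"10.0.1.2"}}
extra_address = {1: {"10.0.1.1", "10.0.2.1"}, 2: {"10.0.1.2"}}
extra_server = {1: {"10.0.1.1"}, 2: {"10.0.1.2"}, 3: {"10.0.1.3"}}
```

With these inputs, the address-only check fires for both `extra_address` and
`extra_server`, while the server-aware check fires only for `extra_server`.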

>> My own experience is that production settings typically involve
>> Zookeeper servers with very consistent hardware where this would not be
>> an issue.

I think this is generally true, but we should consider cases where a user
is upgrading hardware, which might take a while. During this time it would
be ideal if ZK offered the capability of balancing client connections
across an ensemble with heterogeneous hardware. As a user myself, I'd like
to have this feature, especially considering that it does not seem hard to
implement. What Alex proposed should work. Another approach might be to
assign weights to each address (a single server has weight one), which
reduces this to a weighted random selection problem.
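
A sketch of the weighting idea, in Python and under the assumption that each
server's weight of one is split evenly across its addresses. The function name
`pick_address` and the example server map are hypothetical.

```python
import random


def pick_address(server_addresses, rng=random):
    """Pick one address so that every server is equally likely overall.

    Each server has total weight one, split evenly across its addresses.
    """
    addresses = []
    weights = []
    for server, hosts in server_addresses.items():
        for host in hosts:
            addresses.append(host)
            weights.append(1.0 / len(hosts))
    return rng.choices(addresses, weights=weights, k=1)[0]


# zk2 has three interfaces but should still get about half the clients.
servers = {"zk1": ["a"], "zk2": ["b1", "b2", "b3"]}
chosen = pick_address(servers)
```

An uneven split within a server, or per-server weights other than one, would
fit the same shape by changing how `weights` is filled in.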

Overall, I think this proposal has little impact on the server side; most
of the impact is on the client side.


On Tue, Oct 23, 2018 at 9:34 AM Ted Dunning <td...@apache.org> wrote:

> There have been several comments on the document. I will be porting
> discussions from the document back to the mailing list each day.
>
> Alex Shraer makes a good point that with the design as stated, there is no
> provision for dealing with the rebalancing of client connections during
> dynamic reconfiguration. I am very curious whether this needs to be
> addressed in the design since it seems that if connections are redirected,
> the same connection logic should apply. I suppose the text needs an update,
> regardless, even if there is no effect. But is there something I missed
> here? Will there be a code effect?
>
> Another comment points out that if you don't have symmetrical hardware for
> the servers (i.e. more network interfaces on some), then client connections
> are likely to be more numerous on servers with more network connections.
> This is undoubtedly true.
>
> I have a question, however, about this. Is this situation actually
> important enough to make the first version of this change? My own
> experience is that production settings typically involve Zookeeper servers
> with very consistent hardware where this would not be an issue.
>
> What experience do others have, particularly in production situations?

Re: improving tolerance to network failures

Posted by Ted Dunning <td...@apache.org>.
There have been several comments on the document. I will be porting discussions from the document back to the mailing list each day.

Alex Shraer makes a good point that with the design as stated, there is no provision for dealing with the rebalancing of client connections during dynamic reconfiguration. I am very curious whether this needs to be addressed in the design since it seems that if connections are redirected, the same connection logic should apply. I suppose the text needs an update, regardless, even if there is no effect. But is there something I missed here? Will there be a code effect?

Another comment points out that if you don't have symmetrical hardware for the servers (i.e. more network interfaces on some), then client connections are likely to be more numerous on servers with more network connections. This is undoubtedly true.
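
The size of this effect is easy to work out if we assume, for illustration
only, that each client picks uniformly at random among all advertised
addresses. The helper name `expected_share` and the interface counts below are
made up for the example.

```python
from fractions import Fraction


def expected_share(address_counts):
    """Expected fraction of client connections per server when clients
    pick uniformly among all addresses."""
    total = sum(address_counts.values())
    return {s: Fraction(n, total) for s, n in address_counts.items()}


# Two servers with one interface each and one server with two.
shares = expected_share({"zk1": 1, "zk2": 1, "zk3": 2})
# shares == {"zk1": Fraction(1, 4), "zk2": Fraction(1, 4), "zk3": Fraction(1, 2)}
```

Under that assumption, the server with twice as many interfaces carries twice
as many connections as each of the others.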

I have a question, however, about this. Is this situation actually important enough to make the first version of this change? My own experience is that production settings typically involve Zookeeper servers with very consistent hardware where this would not be an issue.

What experience do others have, particularly in production situations?

On 2018/10/23 02:02:12, Ted Dunning <te...@gmail.com> wrote: 
> ...
> I have started a collaborative document to work on the design approach.
> Once that is judged by the community to be sufficiently mature, I will move
> it to a JIRA.
> 
> That document is at
> https://docs.google.com/document/d/1iGVwxeHp57qogwfdodCh9b32P2_kOQaJZ2GDo7j36fI/edit?usp=sharing
> 
> The design document is currently open to the world for commenting so that
> anybody can suggest changes or ask questions. I will act as a bit of a
> moderator so that the document can remain completely open.
>