Posted to user@zookeeper.apache.org by "Marco P." <ma...@gmail.com> on 2015/08/27 02:23:51 UTC

Manually reducing client imbalance

Hello all,

it is a known issue that clients can become unbalanced over time, leaving
a few servers doing all the work while the others are idle.

Long term solutions for this have been discussed (e.g.:
https://issues.apache.org/jira/browse/ZOOKEEPER-856), and I can't wait
to see some progress there.

In the meantime, there is a specific instance of this problem that I'd
like to get feedback on, and maybe try a patch if the idea is well received.

The problem exists in clusters with a large number of clients (say,
10'000) where we want to perform a rolling bounce (i.e. restarting all
servers one by one to avoid causing downtime).

If we start in a situation like this:

  1 : follower : 2000 clients
  2 : follower : 2000 clients
  3 : follower : 2000 clients
  4 : follower : 2000 clients
  5 : leader   : 2000 clients

And proceed to bounce all servers, leaving the leader last (to minimize
the number of leadership changes), we end up in the situation below
right before the leader is bounced (complete list of steps below):

  1 : follower : 2381 clients (bounced)
  2 : follower : 1756 clients (bounced)
  3 : follower : 976  clients (bounced)
  4 : follower : 0    clients (bounced)
  5 : leader   : 4881 clients (not bounced yet)


Now we're going to bounce the leader, which by itself causes some
commotion. Almost half of the clients are going to have to scramble, all
at the same time, to find a new server to connect to.

In some cases, we've seen this go wrong. The leader bounce combined with
a large number of clients migrating en masse has a ripple effect that
can cause sessions to expire and followers to fall behind.


Motivated by this scenario, here's a proposal (again, just a stop-gap
solution while waiting for a long-term solution to client imbalance).

Would it be reasonable to introduce a 4-letter word that forces a server
to shed part of its clients?

e.g. "sh10" tells a server to shed 10% of its clients, "sh50" tells a
server to shed 50%, etc.
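
For context, existing four-letter words (such as "ruok" or "stat") are
sent as plain text over a TCP connection to the server's client port,
and the server closes the connection after replying. The sketch below
shows that mechanism only. The function name, host, and port are
illustrative assumptions, and "sh10" is the command proposed here, not
one that exists today. Recent ZooKeeper releases also require commands
to be enabled through 4lw.commands.whitelist.

```python
import socket


def send_four_letter_word(host, port, command, timeout=5.0):
    """Send a four-letter word to a ZooKeeper server and return its reply.

    Hypothetical helper for illustration; not part of any ZooKeeper client.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Assumes a server listening on localhost:2181 with "ruok" enabled.
    print(send_four_letter_word("localhost", 2181, "ruok"))
    # The proposed command would presumably be issued the same way, e.g.:
    # send_four_letter_word("localhost", 2181, "sh10")  # does not exist yet
```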

This command could be used in the scenario above to gradually migrate
most of the clients away from #5 before bouncing it.
(and in general in any other situation where we want to gently move
clients away from a server before taking it down for maintenance).

After a bounce is complete, it could be used to restore some balance
manually (e.g. by hitting the most loaded server with "sh10" a few times).
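
To get a feel for the effect, here is a rough sketch of what repeated
shedding could do to the distribution. It is a model, not a description
of real client behaviour: it assumes that shed clients reconnect evenly
across the other servers and never come back to the server that dropped
them. The shed() function is hypothetical.

```python
def shed(clients, server, percent):
    """Return the distribution after `server` sheds `percent`% of its clients.

    clients: dict mapping server id -> connected client count.
    Assumption: shed clients spread evenly over the other servers;
    clients that cannot be split evenly stay where they are.
    """
    others = [s for s in clients if s != server]
    share = (clients[server] * percent // 100) // len(others)
    result = dict(clients)
    for s in others:
        result[s] += share
    result[server] -= share * len(others)
    return result


if __name__ == "__main__":
    # State right before the leader (server 5) is bounced.
    state = {1: 2381, 2: 1756, 3: 976, 4: 0, 5: 4881}
    for _ in range(3):  # hit the leader with "sh10" three times
        state = shed(state, 5, 10)
        print(state)
```

In this model each call moves a slice of the leader's clients to the
followers, so the eventual leader bounce disturbs fewer clients at once.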


Is this something that users would find useful?
Is this something developers would accept into the system?

If so, I'd be happy to try and contribute this myself, with some guidance.


Cheers,
M.


---

Full sequence of steps for the numbers provided above.

  After bouncing 1:
  1 : 0
  2 : 2500
  3 : 2500
  4 : 2500
  5 : 2500

  After bouncing 2:
  1 : 625
  2 : 0
  3 : 3125
  4 : 3125
  5 : 3125

  After bouncing 3:
  1 : 1406
  2 : 781
  3 : 0
  4 : 3906
  5 : 3906

  After bouncing 4:
  1 : 2381
  2 : 1756
  3 : 976
  4 : 0
  5 : 4881
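
The sequence above can be approximated with a short simulation. This is
a sketch under one simplifying assumption: the clients of a bounced
server spread evenly over the other four servers, with any remainder
dropped. The function name is made up for illustration. With integer
division it matches the first three steps exactly and the last step to
within one client per server, presumably a rounding difference.

```python
def rolling_bounce(clients, order):
    """Simulate bouncing servers one at a time.

    clients: dict mapping server id -> connected client count.
    order:   server ids in the order they are bounced.
    Assumption: clients of the bounced server are split evenly across
    all other servers; the remainder of the division is dropped.
    """
    clients = dict(clients)  # work on a copy
    history = []
    for bounced in order:
        others = [s for s in clients if s != bounced]
        share = clients[bounced] // len(others)
        for s in others:
            clients[s] += share
        clients[bounced] = 0
        history.append((bounced, dict(clients)))
    return history


if __name__ == "__main__":
    start = {1: 2000, 2: 2000, 3: 2000, 4: 2000, 5: 2000}
    # Bounce the followers first and leave the leader (5) for last.
    for bounced, state in rolling_bounce(start, [1, 2, 3, 4]):
        print(f"After bouncing {bounced}: {state}")
```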
