Posted to user@zookeeper.apache.org by Brian Tarbox <br...@gmail.com> on 2012/12/05 21:27:44 UTC

very uneven distribution of clients to servers...

I have a three node cluster and am getting a very uneven distribution of
client connections to the servers.

The middle server in the connection string list seems to get a tiny slice
of the clients. I understand it's "random", so I don't expect a perfect
distribution, but I'm seeing very consistent numbers like:
12 : 3 : 14 for the connection counts.

I run the zkTop script (excellent!) and so I can see the connection counts
and I've _never_ seen the middle server get more than a few connections.

If I reorder the server addresses in the connection string, it's always the
middle server that gets shortchanged.

Any suggestions / insights?

Running 3.4.5.

Thanks.

-- 
http://about.me/BrianTarbox
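
One way to judge whether counts like 12 : 3 : 14 are plausible under a uniformly random choice is to compute a binomial tail. The sketch below assumes 29 clients that each independently pick one of three servers with probability 1/3 (an idealization, not a description of the actual client), and computes the chance that one particular server ends up with at most 3 of them. The class and method names are made up for illustration.

```java
public class BinomialTailSketch {
    // P(X <= k) for X ~ Binomial(n, p), built up from P(X = 0)
    // using the recurrence P(i+1) = P(i) * (n-i)/(i+1) * p/(1-p).
    static double cdf(int n, double p, int k) {
        double term = Math.pow(1.0 - p, n); // P(X = 0)
        double sum = term;
        for (int i = 0; i < k; i++) {
            term *= (double) (n - i) / (i + 1) * p / (1.0 - p);
            sum += term;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumed model: 29 clients, each choosing among 3 servers uniformly.
        double tail = cdf(29, 1.0 / 3.0, 3);
        System.out.printf("P(a given server gets <= 3 of 29) = %.4f%n", tail);
    }
}
```

Under that assumed model the tail probability comes out well below one percent, so seeing it consistently would point to something other than bad luck.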

Re: very uneven distribution of clients to servers...

Posted by Ted Dunning <te...@gmail.com>.
Kishore,

That should be a good explanation, but it depends on where the returning
node gets put into the replication chain.

Most replicated systems put a returning node at the end of the replication
chain, since that causes the least disruption.

I don't know what ZK does, but this can be tested by determining whether
the middle node is actually the youngest.

Brian,

Can you say if these are new connections that are being placed
non-uniformly?  Or are these existing connections?

Also, are you running things so hot that this will matter?


On Wed, Dec 5, 2012 at 11:01 PM, kishore g <g....@gmail.com> wrote:

> Did all the clients connect after all 3 nodes were up and running? One
> reason you might see this uneven distribution is that when you restart one
> of the zookeeper nodes, all its clients reconnect to one of the remaining
> nodes but then don't reconnect to the original node when it comes back up.
>
> Can you confirm this is not the case?

Re: very uneven distribution of clients to servers...

Posted by Alexander Shraer <sh...@gmail.com>.
Or, as I suggested, just use the trunk, where this is already done.

On Thu, Dec 6, 2012 at 5:25 PM, Ted Dunning <te...@gmail.com> wrote:
> Next test would be to incorporate a small patch to seed the shuffling of
> server names by the client ID or something.
>
> On Thu, Dec 6, 2012 at 12:32 PM, Brian Tarbox <br...@gmail.com> wrote:
>
>> I killed all the clients and then restarted them, so it's not a
>> reconnection issue.
>>

Re: very uneven distribution of clients to servers...

Posted by Ted Dunning <te...@gmail.com>.
Next test would be to incorporate a small patch to seed the shuffling of
server names by the client ID or something.

On Thu, Dec 6, 2012 at 12:32 PM, Brian Tarbox <br...@gmail.com> wrote:

> I killed all the clients and then restarted them, so it's not a
> reconnection issue.
>

Re: very uneven distribution of clients to servers...

Posted by Brian Tarbox <br...@gmail.com>.
Thanks for all the thoughts.

We are using the Java client.
I killed all the clients and then restarted them, so it's not a reconnection issue.


On Dec 5, 2012, at 6:15 PM, Alexander Shraer <sh...@gmail.com> wrote:

> I have some experience with this that may be useful (Marshall and I
> worked on the re-shuffling of clients to servers as part of ZK-1355).
> 
> Kishore's suggestion is the first thing I would check, but there is
> another possibility.
> 
> We were seeing the same issue - the distribution of clients across
> servers was not even (we allow some slack in the tests, but still it
> would frequently be very uneven). The reason ended up being that
> different clients were shuffling the list in the same way, and when I
> incorporated the client's id into the seed the problem went away. I'm
> curious whether this is the problem in this case too - this can be
> easily tested by trying whether the problem is still there with the
> trunk distribution (ZK-1355 is in the trunk) or just applying ZK-1355.
> 
> We're currently seeing the same thing with our C tests:
> https://issues.apache.org/jira/browse/ZOOKEEPER-1594
> But haven't yet tried whether the same solution would work there.
> 
> Alex
> 
> 
> On Wed, Dec 5, 2012 at 12:54 PM, Ted Dunning <te...@gmail.com> wrote:
>> Shuffle depends on Math.random which is seeded by time of start.  That
>> should be just fine.
>> 
>> On Wed, Dec 5, 2012 at 11:35 PM, Camille Fournier <ca...@apache.org> wrote:
>> 
>>> If you're using the Java ZooKeeper client, you can see in the code that the
>>> way connections are established is that we parse the server list, resolve
>>> them to inet addresses, and call Collections.shuffle on the list of server
>>> addresses. It's possible that Collections.shuffle is not random enough but
>>> I suspect that there's something else happening.
>>> 

Re: very uneven distribution of clients to servers...

Posted by Alexander Shraer <sh...@gmail.com>.
I have some experience with this that may be useful (Marshall and I
worked on the re-shuffling of clients to servers as part of ZK-1355).

Kishore's suggestion is the first thing I would check, but there is
another possibility.

We were seeing the same issue - the distribution of clients across
servers was not even (we allow some slack in the tests, but still it
would frequently be very uneven). The reason ended up being that
different clients were shuffling the list in the same way, and when I
incorporated the client's ID into the seed, the problem went away. I'm
curious whether this is the problem in this case too. This can be
easily tested by checking whether the problem is still there with the
trunk distribution (ZK-1355 is in the trunk), or by just applying ZK-1355.

We're currently seeing the same thing with our C tests:
https://issues.apache.org/jira/browse/ZOOKEEPER-1594
We haven't yet tried whether the same solution would work there, though.

Alex
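
The effect described above can be illustrated with a small sketch: two clients that seed their shuffle identically produce the same server order, while mixing a per-client value into the seed gives each client its own seed. This is only an illustration and not the ZK-1355 patch itself; `seedFor` and the numeric client IDs are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SeededShuffleSketch {
    // Shuffle a copy of the server list with an explicitly seeded Random.
    static List<String> shuffled(List<String> servers, long seed) {
        List<String> copy = new ArrayList<>(servers);
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    // Hypothetical seed mix: start time XOR a per-client value.
    static long seedFor(long startMillis, long clientId) {
        return startMillis ^ clientId;
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("zk1", "zk2", "zk3");
        long now = System.currentTimeMillis();

        // Two clients with the same seed walk the servers in the same order.
        System.out.println("same seed, same order: "
                + shuffled(servers, now).equals(shuffled(servers, now)));

        // Tally which server each client would try first when the
        // client ID is mixed into the seed.
        int[] firstPick = new int[servers.size()];
        for (long clientId = 0; clientId < 3000; clientId++) {
            String first = shuffled(servers, seedFor(now, clientId)).get(0);
            firstPick[servers.indexOf(first)]++;
        }
        System.out.println("first-pick tallies: " + Arrays.toString(firstPick));
    }
}
```

Running the tally against a real deployment's client count is a quick way to see whether a given seeding scheme spreads first picks across the servers.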


On Wed, Dec 5, 2012 at 12:54 PM, Ted Dunning <te...@gmail.com> wrote:
> Shuffle depends on Math.random which is seeded by time of start.  That
> should be just fine.
>
> On Wed, Dec 5, 2012 at 11:35 PM, Camille Fournier <ca...@apache.org> wrote:
>
>> If you're using the Java ZooKeeper client, you can see in the code that the
>> way connections are established is that we parse the server list, resolve
>> them to inet addresses, and call Collections.shuffle on the list of server
>> addresses. It's possible that Collections.shuffle is not random enough but
>> I suspect that there's something else happening.
>>

Re: very uneven distribution of clients to servers...

Posted by Ted Dunning <te...@gmail.com>.
Shuffle depends on Math.random which is seeded by time of start.  That
should be just fine.

On Wed, Dec 5, 2012 at 11:35 PM, Camille Fournier <ca...@apache.org> wrote:

> If you're using the Java ZooKeeper client, you can see in the code that the
> way connections are established is that we parse the server list, resolve
> them to inet addresses, and call Collections.shuffle on the list of server
> addresses. It's possible that Collections.shuffle is not random enough but
> I suspect that there's something else happening.
>

Re: very uneven distribution of clients to servers...

Posted by Camille Fournier <ca...@apache.org>.
Kishore has a good idea to investigate.

If you're using the Java ZooKeeper client, you can see in the code that the
way connections are established is that we parse the server list, resolve
them to inet addresses, and call Collections.shuffle on the list of server
addresses. It's possible that Collections.shuffle is not random enough but
I suspect that there's something else happening.
If you're not using the Java client, I can't comment, but it might be worth
glancing at the source code for your client library to make sure it looks
sane and there's not a bug.
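
The steps described above (parse the server list, build addresses, shuffle) can be sketched roughly as follows. This is a simplified illustration, not the client's actual code: the class and `parse` helper are made up, addresses are left unresolved to avoid DNS lookups, and chroot suffixes in the connection string are not handled.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ShuffleSketch {
    // Split "host1:2181,host2:2181" into socket addresses.
    // Assumption: a missing port falls back to 2181.
    static List<InetSocketAddress> parse(String connectString) {
        List<InetSocketAddress> servers = new ArrayList<>();
        for (String hostPort : connectString.split(",")) {
            String[] parts = hostPort.trim().split(":");
            int port = parts.length > 1 ? Integer.parseInt(parts[1]) : 2181;
            // Unresolved on purpose so the sketch runs without DNS.
            servers.add(InetSocketAddress.createUnresolved(parts[0], port));
        }
        return servers;
    }

    public static void main(String[] args) {
        List<InetSocketAddress> servers = parse("zk1:2181,zk2:2181,zk3:2181");
        // Shuffle with the default source of randomness, then try servers in order.
        Collections.shuffle(servers);
        System.out.println("first server tried: " + servers.get(0));
    }
}
```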


On Wed, Dec 5, 2012 at 5:01 PM, kishore g <g....@gmail.com> wrote:

> Did all the clients connect after all 3 nodes were up and running? One
> reason you might see this uneven distribution is that when you restart one
> of the zookeeper nodes, all its clients reconnect to one of the remaining
> nodes but then don't reconnect to the original node when it comes back up.
>
> Can you confirm this is not the case?

Re: very uneven distribution of clients to servers...

Posted by kishore g <g....@gmail.com>.
Did all the clients connect after all 3 nodes were up and running? One
reason you might see this uneven distribution is that when you restart one
of the zookeeper nodes, all its clients reconnect to one of the remaining
nodes but then don't reconnect to the original node when it comes back up.

Can you confirm this is not the case?
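
A toy simulation of the restart scenario, assuming displaced clients pick uniformly among the surviving servers and never move back (the starting counts and names are made up for illustration):

```java
import java.util.Arrays;
import java.util.Random;

public class RestartSkewSketch {
    // Move every client of the restarted server to a randomly chosen
    // other server; nobody returns once the server is back.
    static int[] afterRestart(int[] counts, int restarted, Random rnd) {
        int[] result = counts.clone();
        int displaced = result[restarted];
        result[restarted] = 0;
        for (int i = 0; i < displaced; i++) {
            int target = rnd.nextInt(counts.length - 1);
            if (target >= restarted) {
                target++; // skip over the restarted server's index
            }
            result[target]++;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] before = {10, 9, 10}; // assumed even starting distribution
        int[] after = afterRestart(before, 1, new Random());
        System.out.println(Arrays.toString(before) + " -> " + Arrays.toString(after));
    }
}
```

In this model the restarted server stays at zero until new clients connect, which is why checking restart history is a reasonable first step.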


On Wed, Dec 5, 2012 at 12:27 PM, Brian Tarbox <br...@gmail.com> wrote:

> I have a three node cluster and am getting a very uneven distribution of
> client connections to the servers.
>
> The middle server in the connection string list seems to get a tiny slice
> of the clients. I understand it's "random", so I don't expect a perfect
> distribution, but I'm seeing very consistent numbers like:
> 12 : 3 : 14 for the connection counts.
>
> I run the zkTop script (excellent!) and so I can see the connection counts
> and I've _never_ seen the middle server get more than a few connections.
>
> If I reorder the server addresses in the connection string, it's always the
> middle server that gets shortchanged.
>
> Any suggestions / insights?
>
> Running 3.4.5.
>
> Thanks.
>
> --
> http://about.me/BrianTarbox
>