Posted to user@zookeeper.apache.org by Aaron Crow <di...@yahoo.com> on 2010/05/13 19:18:36 UTC

Can't ls with large node count and I don't understand the use of jute.maxbuffer

We're running Zookeeper with about 2 million nodes. It's working, with one
specific exception: When I try to get all children on one of the main node
trees, I get an IOException out of ClientCnxn ("Packet len4648067 is out of
range!"). There are 150329 children under the node in question. I should
also mention that I can successfully ls other nodes with similarly high
children counts. But this specific node always fails.

Googling led me to see that Mahadev dealt with this last year:
http://www.mail-archive.com/zookeeper-commits@hadoop.apache.org/msg00175.html

Source diving led me to see that ClientCnxn enforces a bound based on
the jute.maxbuffer setting:

> packetLen = Integer.getInteger("jute.maxbuffer", 4096 * 1024);
>
> ...
>
> if (len < 0 || len >= packetLen) {
>     throw new IOException("Packet len" + len + " is out of range!");
> }
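[A quick way to see how that limit is picked up: since it is read via Integer.getInteger, jute.maxbuffer is a plain JVM system property, so it can be raised with -Djute.maxbuffer=... on the client's command line (it must be set before any ZooKeeper client class loads). A minimal sketch:]

```java
// Minimal sketch: jute.maxbuffer is read as a JVM system property, so raising
// it is just a matter of setting the property before the client reads it.
public class MaxBufferDemo {
    public static void main(String[] args) {
        // With no -Djute.maxbuffer flag, the client falls back to 4096 * 1024.
        System.out.println(Integer.getInteger("jute.maxbuffer", 4096 * 1024)); // prints 4194304

        // Equivalent of starting the JVM with -Djute.maxbuffer=8388608 (8 MB).
        System.setProperty("jute.maxbuffer", String.valueOf(8 * 1024 * 1024));
        System.out.println(Integer.getInteger("jute.maxbuffer", 4096 * 1024)); // prints 8388608
    }
}
```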

So maybe I could bump this up in config... but I'm confused by the
documentation on jute.maxbuffer:
"It specifies the maximum size of the data that can be stored in a znode."

It's true we have an extremely high node count. However, we've been careful
to keep each node's data very small -- e.g., we certainly should have no
single data entry longer than 256 characters. The way I'm reading the docs,
the jute.maxbuffer bound applies only to the data size of individual nodes,
and shouldn't relate to child count. Or does it relate to child count as
well?

Here is a stat on the offending node:

cZxid = 0x10000000e
ctime = Mon May 03 17:40:58 PDT 2010
mZxid = 0x10000000e
mtime = Mon May 03 17:40:58 PDT 2010
pZxid = 0x100315064
cversion = 150654
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 150372


Thanks for any insights...


Aaron

Re: Can't ls with large node count and I don't understand the use of jute.maxbuffer

Posted by Mahadev Konar <ma...@yahoo-inc.com>.
Hi Aaron,
  Each request and response between client and server is sent as a
(buflen, buffer) packet. The contents of the packet are then deserialized
from this buffer.

Looks like the size of the packet (buflen) is big in your case. We
deliberately limit large packets to discourage folks from using ZooKeeper as
a bulk data store.

We also discourage creating a flat hierarchy with too many direct children
(your case). Such directories can put a huge load on the network and servers
when a large number of clients list them. We always suggest bucketing these
children into a more hierarchical structure.

You are probably hitting the default limit (4096 * 1024, about 4 MB, in the
client code you quoted)! You might want to raise it in your client
configuration as a temporary fix. But longer term you might want to
restructure your data in ZooKeeper to make it more hierarchical via some
kind of bucketing.
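[One sketch of the bucketing idea: hash each child name into one of a fixed number of bucket znodes so no single parent ever accumulates a huge child list. The path layout, bucket count, and helper names below are illustrative assumptions, not a ZooKeeper API:]

```java
// Illustrative bucketing sketch (assumed layout, not part of ZooKeeper):
// children live under /parent/bucket-NNN/child instead of directly under
// /parent, so each getChildren response stays small.
public class ChildBucketing {
    static final int NUM_BUCKETS = 256; // assumption: tune to your child count

    static String bucketPath(String parent, String child) {
        // Stable hash of the child name picks its bucket deterministically.
        int bucket = Math.floorMod(child.hashCode(), NUM_BUCKETS);
        return String.format("%s/bucket-%03d/%s", parent, bucket, child);
    }

    public static void main(String[] args) {
        // A client would create the znode at this path; to enumerate all
        // children it iterates getChildren over the 256 bucket nodes instead
        // of issuing one huge list on the parent.
        System.out.println(bucketPath("/app/items", "item-12345"));
    }
}
```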

Thanks
mahadev



