Posted to user@curator.apache.org by "Szekeres, Zoltan" <Zo...@morganstanley.com> on 2016/05/31 10:16:08 UTC

Use-case with lots of child nodes

Hi Curator users,


I have a use-case where I need to create a very large number (~70,000) of child nodes under a parent. These nodes contain no data and will each have only a handful of child nodes.

e.g.

/someparentNode/LotsOfChildNodesHere-1/ACoupleofNodesAtThisLevel

/someparentNode/LotsOfChildNodesHere-2/ACoupleofNodesAtThisLevel

...

/someparentNode/LotsOfChildNodesHere-70000/ACoupleofNodesAtThisLevel



I've read (https://zookeeper.apache.org/doc/r3.4.8/zookeeperAdmin.html) that there is a limit of 1 MB, but I only hit the limit for the getChildren operation at around 4 MB. I'm interested in what causes the difference.
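
A rough way to see how close a child list gets to these limits is to add up the bytes a getChildren reply would carry. The sketch below assumes the list is encoded as a 4-byte element count followed by each child name as a 4-byte length plus its UTF-8 bytes, and it ignores the reply header. The class and method names are hypothetical; the node names are the ones from the example above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ZooKeeper or Curator.
final class ChildListSizeEstimate {

    // Assumed encoding: int32 count, then per name an int32 length + UTF-8 bytes.
    static long estimateBytes(List<String> childNames) {
        long total = 4; // element count
        for (String name : childNames) {
            total += 4 + name.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    // Builds the example names LotsOfChildNodesHere-1 .. LotsOfChildNodesHere-n.
    static List<String> exampleNames(int n) {
        List<String> names = new ArrayList<>(n);
        for (int i = 1; i <= n; i++) {
            names.add("LotsOfChildNodesHere-" + i);
        }
        return names;
    }

    public static void main(String[] args) {
        long bytes = estimateBytes(exampleNames(70_000));
        System.out.println("Estimated child list payload: " + bytes + " bytes");
        // 0xfffff is the default quoted in the admin guide ("just under 1M").
        System.out.println("Above documented default: " + (bytes > 0xfffff));
    }
}
```

Under these assumptions the short example names already come to roughly 2 MB for 70,000 children, so longer real-world names could plausibly reach the ~4 MB where the failure was observed.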



To give more detail I have a primary and secondary use-case:

My primary use-case includes having watchers on the children of "/someparentNode" and requesting getChildren for "/someparentNode/LotsOfChildNodesHere-N" (which only has a couple of child nodes).

My secondary use-case is requesting the children of "/someparentNode" itself, only occasionally and for reporting purposes. It will probably hold fewer than 70k child nodes, but that is where I hit the limit.



I'm looking for answers to the following questions:

What stability issues do you think might occur with lots of nodes under one node, even if we read them rarely?

Can I reliably use the "jute.maxbuffer" system property on the client in the future?

Is the asymmetry between the default value on the client side and on the server side accidental or intentional?
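
On the mechanics of the "jute.maxbuffer" question: the admin guide linked above describes it as a plain Java system property with a default of 0xfffff, and says that if it is changed it must be set on all servers and clients. A minimal sketch of setting it programmatically on the client follows. It assumes the property is read once when the ZooKeeper client classes load, so it has to be set before any of them are touched; the 4 MB value and the class name are illustrative only.

```java
// Hypothetical client bootstrap; only JDK calls are used here.
final class JuteBufferConfig {

    static final String PROPERTY = "jute.maxbuffer";

    // Sets the property unless it was already supplied with -Djute.maxbuffer=...
    // Returns the value that is in effect afterwards.
    static int ensureMaxBuffer(int bytes) {
        if (System.getProperty(PROPERTY) == null) {
            System.setProperty(PROPERTY, Integer.toString(bytes));
        }
        return Integer.getInteger(PROPERTY, bytes);
    }

    public static void main(String[] args) {
        // Assumption: call this before creating any ZooKeeper or Curator client.
        int effective = ensureMaxBuffer(4 * 1024 * 1024);
        System.out.println(PROPERTY + " = " + effective);
    }
}
```

Passing -Djute.maxbuffer=4194304 on the JVM command line should have the same effect and avoids any concern about class-loading order.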



Any advice is much appreciated.


Thanks,

Zoltan Szekeres


________________________________

NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions or views contained herein are not intended to be, and do not constitute, advice within the meaning of Section 975 of the Dodd-Frank Wall Street Reform and Consumer Protection Act. If you have received this communication in error, please destroy all electronic and paper copies; do not disclose, use or act upon the information; and notify the sender immediately. Mistransmission is not intended to waive confidentiality or privilege. Morgan Stanley reserves the right, to the extent permitted under applicable law, to monitor electronic communications. This message is subject to terms available at the following link: http://www.morganstanley.com/disclaimers If you cannot access these links, please notify us by reply message and we will send the contents to you. By messaging with Morgan Stanley you consent to the foregoing.

RE: Use-case with lots of child nodes

Posted by "Szekeres, Zoltan" <Zo...@morganstanley.com>.
Thanks Jordan for the response.

Zoltan

From: Jordan Zimmerman [mailto:jordan@jordanzimmerman.com]
Sent: Tuesday, May 31, 2016 6:31 PM
To: user@curator.apache.org
Cc: Hejj, Botond (Enterprise Infrastructure); Gupta, Abhishek (IST); Erdei, Andras (IST)
Subject: Re: Use-case with lots of child nodes

The ZK jute limit is part of their protocol. Any API usage is limited to 1MB (unless you increase it). In the case of getChildren(), that is the serialization overhead plus the combined length of the child names. The TL;DR is that you shouldn’t create lots of children in ZK. Redesign your algorithm to spread out the children. This is a ZK issue and not a Curator issue.

-JZ


Re: Use-case with lots of child nodes

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
The ZK jute limit is part of their protocol. Any API usage is limited to 1MB (unless you increase it). In the case of getChildren(), that is the serialization overhead plus the combined length of the child names. The TL;DR is that you shouldn’t create lots of children in ZK. Redesign your algorithm to spread out the children. This is a ZK issue and not a Curator issue.

-JZ
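
One way to read "spread out the children" is to add an intermediate level of bucket nodes and derive the bucket from the child name, so that no single getChildren call returns the whole set. The sketch below is only an illustration of that idea, not something prescribed in the thread: the bucket count of 256, the "bucket-NNN" naming, and the use of String.hashCode() are all assumptions, and the class is hypothetical.

```java
import java.util.Locale;

// Hypothetical path layout helper; it does not talk to ZooKeeper.
final class ChildSharding {

    static final String PARENT = "/someparentNode";

    // Maps a child name to a stable bucket index in [0, buckets).
    static int bucketFor(String childName, int buckets) {
        return Math.floorMod(childName.hashCode(), buckets);
    }

    // Builds /someparentNode/bucket-NNN/<childName>.
    static String shardedPath(String childName, int buckets) {
        return String.format(Locale.ROOT, "%s/bucket-%03d/%s",
                PARENT, bucketFor(childName, buckets), childName);
    }

    // Counts how many of the example names land in each bucket.
    static int[] bucketSizes(int children, int buckets) {
        int[] sizes = new int[buckets];
        for (int i = 1; i <= children; i++) {
            sizes[bucketFor("LotsOfChildNodesHere-" + i, buckets)]++;
        }
        return sizes;
    }

    public static void main(String[] args) {
        int buckets = 256; // illustrative value
        System.out.println(shardedPath("LotsOfChildNodesHere-42", buckets));
        int largest = 0;
        for (int size : bucketSizes(70_000, buckets)) {
            largest = Math.max(largest, size);
        }
        System.out.println("Largest bucket holds " + largest + " children");
    }
}
```

The trade-off is that a full listing for reporting becomes one getChildren call on the parent plus one per bucket, and watches on the children would have to be registered per bucket rather than once on "/someparentNode".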
