Posted to user@zookeeper.apache.org by Mathias Hodler <ma...@gmail.com> on 2013/04/08 12:32:54 UTC

Zookeeper does not clean up deleted nodes

Hi,

I ran some tests, and it seems that ZooKeeper doesn't clean up the last 500
deleted nodes.

In my test I created a node and deleted it immediately afterwards. I
repeated this step 1000 times and then triggered a full GC. These are the
results:

Creating and deleting 1000 nodes, where each node has...
...1000 KB data = 529 MB heap used after full GC
... 500 KB data = 281 MB heap used after full GC
... 256 KB data = 140 MB heap used after full GC
... 128 KB data =  68 MB heap used after full GC

If I create 1000 nodes with 1000 KB of data each and delete them, and then
create 1000 nodes with 128 KB of data each and delete those as well, 68 MB
of heap space is used.

So it seems ZooKeeper caches, or doesn't clean up, the last 500 deleted nodes.
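
A rough arithmetic check of this hypothesis, as a sketch only: it assumes 1 MB = 1024 KB, ignores the server's baseline heap usage, and uses illustrative names (`RETAINED`, `expected_mb`) that do not come from ZooKeeper.

```python
# Measured heap after full GC (MB), keyed by per-node payload size (KB),
# copied from the results in this post.
measured_mb = {1000: 529, 500: 281, 256: 140, 128: 68}

RETAINED = 500  # hypothesis: number of payloads still referenced


def expected_mb(node_kb, retained=RETAINED):
    """Heap needed to hold `retained` payloads of `node_kb` KB each, in MB."""
    return retained * node_kb / 1024


for node_kb, heap_mb in measured_mb.items():
    estimate = expected_mb(node_kb)
    # A ratio close to 1 means the measurement fits the hypothesis.
    print(f"{node_kb:>5} KB: estimated {estimate:6.1f} MB, "
          f"measured {heap_mb} MB, ratio {heap_mb / estimate:.2f}")
```

For the four measurements above, every ratio falls between 1.0 and 1.2, which is consistent with roughly 500 payloads staying referenced plus some fixed overhead.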

Is this a bug, or is there a configuration parameter to change this behaviour?

Re: Zookeeper does not clean up deleted nodes

Posted by Henry Robinson <he...@cloudera.com>.
As Ben says, this is a feature, not a bug. However, the memory usage is
still excessive; see this jira:
https://issues.apache.org/jira/browse/ZOOKEEPER-1473

Henry

On 8 April 2013 09:31, Benjamin Reed <br...@apache.org> wrote:

> are you looking at the leader or the follower? the leader keeps the last
> few transactions in memory to speed up syncing with new followers. that
> might be what you are seeing.



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679

Re: Zookeeper does not clean up deleted nodes

Posted by Mathias Hodler <ma...@gmail.com>.
I created a new issue: https://issues.apache.org/jira/browse/ZOOKEEPER-1687


2013/4/8 Benjamin Reed <br...@apache.org>

> it would be very simple to make that 500 configurable. you should propose a
> change.

Re: Zookeeper does not clean up deleted nodes

Posted by Benjamin Reed <br...@apache.org>.
It would be very simple to make that 500 configurable. You should propose a
change.
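
As an illustration of what "configurable" could look like, here is a minimal sketch in Python rather than ZooKeeper's own Java code. The environment variable name `RECENT_TXN_LIMIT` and the function `load_limit` are hypothetical and are not existing ZooKeeper settings; only the default of 500 comes from this thread.

```python
import os
from collections import deque

DEFAULT_LIMIT = 500  # the fixed value discussed in this thread
ENV_VAR = "RECENT_TXN_LIMIT"  # hypothetical name, not a real ZooKeeper setting


def load_limit(env=os.environ):
    """Read the retention limit from `env`, falling back to the default."""
    raw = env.get(ENV_VAR)
    if raw is None:
        return DEFAULT_LIMIT
    try:
        limit = int(raw)
    except ValueError:
        # Unparseable value: keep the default rather than fail at startup.
        return DEFAULT_LIMIT
    # Non-positive limits make no sense here, so fall back as well.
    return limit if limit > 0 else DEFAULT_LIMIT


# A bounded buffer sized by the configured limit (100 in this example).
recent = deque(maxlen=load_limit({ENV_VAR: "100"}))
```

Falling back to the default on bad input is one possible design choice; a real change might prefer to reject an invalid value outright.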


On Mon, Apr 8, 2013 at 9:40 AM, Mathias Hodler <ma...@gmail.com> wrote:

> Thanks, this could be the reason. I only used a single zookeeper server, so
> it should act as a leader.
>
> So if I need to store larger files (about 1MB) the only option is to
> increase the heap space? I know that zookeeper is designed for small files,
> but I'm using zookeeper with solr and solr stores all the index
> configuration with large dictionaries in zookeeper.

Re: Zookeeper does not clean up deleted nodes

Posted by Mathias Hodler <ma...@gmail.com>.
Thanks, this could be the reason. I only used a single ZooKeeper server, so
it should act as the leader.

So if I need to store larger files (about 1 MB), is the only option to
increase the heap space? I know that ZooKeeper is designed for small files,
but I'm using ZooKeeper with Solr, and Solr stores all of its index
configuration, including large dictionaries, in ZooKeeper.


2013/4/8 Benjamin Reed <br...@apache.org>

> are you looking at the leader or the follower? the leader keeps the last
> few transactions in memory to speed up syncing with new followers. that
> might be what you are seeing.

Re: Zookeeper does not clean up deleted nodes

Posted by Benjamin Reed <br...@apache.org>.
Are you looking at the leader or the follower? The leader keeps the last
few transactions in memory to speed up syncing with new followers. That
might be what you are seeing.
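
To illustrate why data from deleted nodes can stay on the heap, here is a toy model of a bounded in-memory transaction log. It is a sketch in Python, not ZooKeeper's actual implementation: the class `RecentTxnLog`, the limit of 500 entries, and the choice to count each create and each delete as one entry are all assumptions, and payload sizes are scaled down from KB to bytes to keep it light. It does not try to reproduce the reported heap figures.

```python
from collections import deque


class RecentTxnLog:
    """Toy model of a server keeping its most recent write transactions."""

    def __init__(self, limit=500):
        # A deque with maxlen silently drops the oldest entry when full.
        self._log = deque(maxlen=limit)

    def append(self, op, path, data=b""):
        # The payload stays referenced for as long as the entry is in the log.
        self._log.append((op, path, data))

    def retained_bytes(self):
        return sum(len(data) for _, _, data in self._log)

    def __len__(self):
        return len(self._log)


log = RecentTxnLog(limit=500)


def create_and_delete(count, size):
    """Create `count` nodes of `size` bytes, deleting each right away."""
    for i in range(count):
        path = f"/test/node-{i}"
        log.append("create", path, bytes(size))
        log.append("delete", path)


create_and_delete(1000, 1000)  # large payloads
after_large = log.retained_bytes()
create_and_delete(1000, 128)   # small payloads push the large ones out
after_small = log.retained_bytes()
print(len(log), after_large, after_small)
```

In this model the retained bytes depend only on the most recent writes, which matches the observation that heap usage drops to the 128 KB level once the larger nodes have been followed by enough smaller ones.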

