Posted to dev@zookeeper.apache.org by Victor <vi...@yahoo.com> on 2011/06/01 08:00:36 UTC

Memory leak in zookeeper 3.3.2 and 3.3.3?

Hi,
I apologize for the broadcast.  I searched the archive before sending this email to the mailing list.
 
 We are using the Cages library + ZooKeeper 3.3.3 to synchronize the creation of forum user names (which need to be unique).
  This is the only use case for which we write to ZooKeeper.

  We used the Cages ZkWriteLock to obtain a write lock; the code is below (it is very straightforward):
    org.wyki.zookeeper.cages.ZkWriteLock lock = new org.wyki.zookeeper.cages.ZkWriteLock("/User/ForumUsername/" + forumUsername);
    try {
        boolean lockAcquired = lock.acquire(5, TimeUnit.SECONDS);
        ......
    } finally {
        lock.release();
    }
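
For readers following the locking pattern above, here is a minimal sketch of "acquire with a timeout, then release in finally". It uses java.util.concurrent.locks.ReentrantLock purely as a local stand-in for ZkWriteLock; the Cages API itself is not reproduced here, and the class and method names (TimedLockSketch, runWithLock) are hypothetical. The one difference from the snippet above is that the result of the timed acquire is checked, so the critical section and the release only run when the lock was actually obtained. Whether ZkWriteLock.release() is safe to call after a failed acquire is not stated in this thread.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Stand-in sketch: ReentrantLock plays the role of ZkWriteLock (assumption).
final class TimedLockSketch {

    /**
     * Tries to take the lock within the timeout and, if successful, runs the
     * critical section and always releases the lock afterwards.
     * Returns false when the lock could not be acquired in time.
     */
    static boolean runWithLock(ReentrantLock lock, long timeout, TimeUnit unit,
                               Runnable criticalSection) {
        boolean acquired;
        try {
            acquired = lock.tryLock(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
        if (!acquired) {
            return false; // timed out: nothing to release
        }
        try {
            criticalSection.run();
            return true;
        } finally {
            lock.unlock(); // released only because it was acquired
        }
    }

    public static void main(String[] args) {
        ReentrantLock lock = new ReentrantLock();
        boolean ok = runWithLock(lock, 5, TimeUnit.SECONDS,
                () -> System.out.println("creating forum user name"));
        System.out.println("lock acquired and released: " + ok);
    }
}
```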
   
  In our load test (a single ZooKeeper server), even with a 3GB max heap size, ZooKeeper runs out of memory after ~3 hours.
  So I decided to profile ZooKeeper with the YourKit Java Profiler (9.5) against one ZooKeeper server.   After every 6 of the above calls to ZooKeeper (I picked 6 arbitrarily), I saw the memory usage increase by ~200K (in YourKit, retained size increased ~220K and shallow size increased
  Even after 1 hour or more, and even when I forced garbage collection (done in YourKit), the memory added by the 6 calls was not released. I ran the 6 calls a few times and observed the same thing. So I suspect there is a memory leak (and that is why we got OutOfMemory in the load test).
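
As a rough sanity check on the figures above, the following sketch works out how many lock calls it would take to fill the heap if every 6 calls retained ~200K, and what call rate that implies over ~3 hours. It assumes that "K" means KiB, that the whole 3072 MB heap is available to the leak, and that growth is linear; none of these is confirmed in the thread, and the class name LeakRateSketch is hypothetical.

```java
// Back-of-the-envelope arithmetic only; all inputs come from the message above.
final class LeakRateSketch {

    /** Calls needed to consume heapBytes if each batch of batchCalls calls retains bytesPerBatch. */
    static double callsToFillHeap(long heapBytes, long bytesPerBatch, int batchCalls) {
        double bytesPerCall = (double) bytesPerBatch / batchCalls;
        return heapBytes / bytesPerCall;
    }

    public static void main(String[] args) {
        long heapBytes = 3072L * 1024 * 1024; // -Xmx3072m
        long bytesPerBatch = 200L * 1024;     // ~200K retained per 6 calls (assuming KiB)
        double calls = callsToFillHeap(heapBytes, bytesPerBatch, 6);
        double callsPerSecond = calls / (3 * 3600.0); // ~3 hours until OutOfMemory
        System.out.printf("calls to fill heap: %.0f, implied rate: %.1f calls/s%n",
                calls, callsPerSecond);
    }
}
```

Under these assumptions the heap would be exhausted after on the order of 94,000 calls, which over 3 hours is somewhat under 9 calls per second. If the load test ran at a very different rate, the per-call growth measured in the profiler may not be representative.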
  Looking further at the hot spots in YourKit, the memory increase is in the form of String, char[], Class, and HashMap$Entry objects (Java classes or types), mainly from the method invocations below:
  
  org.apache.zookeeper.server.FinalRequestProcessor.processRequest(Request)
  org.apache.zookeeper.server.PrepRequestProcessor.pRequest(Request)
  org.apache.zookeeper.server.NIOServerCnxn$Factory.run()
  
  But from the code, the leak is not obvious.
  
  
  We used the JVM startup options/flags below (for ZooKeeper):
  -server -Xms1536m -Xmx3072m -Xloggc:/var/zookeeper/logs/gc.log -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=54321 -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false
  
    I looked through the ZooKeeper documentation thoroughly (especially the administration guide), but couldn't find a way to tune this (to avoid the suspected memory leak above).
   
   Is there a memory leak in ZooKeeper 3.3.3 (or 3.3.2)? If there is, how could we configure ZooKeeper to avoid or reduce that leak?    Which stable version should we use? Did we misconfigure anything?
   
Please advise or help. Thanks a lot!
 
Victor

RE: Memory leak in zookeeper 3.3.2 and 3.3.3?

Posted by "Fournier, Camille F. [Tech]" <Ca...@gs.com>.
I would bet that Ted is right about this, but if you're still having problems and want to put the YourKit profile up somewhere, I could take a look later today.

C

-----Original Message-----
From: Ted Dunning [mailto:ted.dunning@gmail.com] 
Sent: Wednesday, June 01, 2011 2:07 AM
To: user@zookeeper.apache.org
Subject: Re: Memory leak in zookeeper 3.3.2 and 3.3.3?

What happens if you stop the client (either an orderly shutdown, closing ZK
or a hard stop with enough time for ephemerals to go away)?

Does GC then reclaim the memory?

What does the dump command show in terms of how many connections and
ephemerals there are?

What does ls in the command line client show for how many znodes there are?

Usually when I see this sort of behavior, it means that I have been accumulating
data in ZK in a way that I didn't intend.  I have had ZK up for months to
years without seeing this behavior.


Re: Memory leak in zookeeper 3.3.2 and 3.3.3?

Posted by Victor <vi...@yahoo.com>.
Hi Ted,
 
Many thanks for your response. I was investigating a production issue and therefore did not respond to you earlier. Sorry about that.
 
I tried the dump and stat commands via telnet; the output is below:
*************dump*************:
  -logbash-3.2$ telnet localhost 2181
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
dump
SessionTracker dump:
Session Sets (3):
0 expire at Wed Jun 01 20:11:44 GMT+00:00 2011:
0 expire at Wed Jun 01 20:11:46 GMT+00:00 2011:
2 expire at Wed Jun 01 20:11:48 GMT+00:00 2011:
        0x13047d8de8a0000
        0x13047d8de8a0001
ephemeral nodes dump:
Sessions with Ephemerals (1):
0x13047d8de8a0001:
Connection closed by foreign host.
 
*************stat**************
-logbash-3.2$ telnet localhost 2181
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
stat
Zookeeper version: 3.3.2-1031432, built on 11/05/2010 05:32 GMT
Clients:
 /10.151.78.31:47473[1](queued=0,recved=29879,sent=29879)
 /127.0.0.1:41928[0](queued=0,recved=1,sent=0)
 /10.151.74.36:18484[1](queued=0,recved=30067,sent=30067)
Latency min/avg/max: 0/0/61
Received: 59951
Sent: 59950
Outstanding: 0
Zxid: 0xeea794
Mode: standalone
Node count: 246
Connection closed by foreign host.
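
To compare successive stat snapshots (for example, to see whether "Node count" keeps growing during the load test), a small parser can help. The sketch below is only an illustration that assumes the "Key: value" layout shown in the output above; the class name StatOutputSketch and the synthetic "Client connections" key are hypothetical and not part of ZooKeeper.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Parses the text returned by the `stat` command into key/value pairs.
final class StatOutputSketch {

    static Map<String, String> parse(String statOutput) {
        Map<String, String> fields = new LinkedHashMap<>();
        int clients = 0;
        for (String raw : statOutput.split("\\r?\\n")) {
            String line = raw.trim();
            if (line.startsWith("/")) {   // client lines look like /10.151.78.31:47473[1](...)
                clients++;
                continue;
            }
            int colon = line.indexOf(": ");
            if (colon > 0) {              // e.g. "Node count: 246"
                fields.put(line.substring(0, colon), line.substring(colon + 2).trim());
            }
        }
        // Synthetic entry (not printed by ZooKeeper): number of client lines seen.
        fields.put("Client connections", Integer.toString(clients));
        return fields;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "Clients:",
                " /10.151.78.31:47473[1](queued=0,recved=29879,sent=29879)",
                "Zxid: 0xeea794",
                "Mode: standalone",
                "Node count: 246");
        System.out.println(parse(sample));
    }
}
```

Sampling this once a minute and logging "Node count" alongside heap usage would show whether memory growth tracks the number of znodes.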

I forgot to mention yesterday that we also monitor ZK once every minute by making the call below:
ZkSessionManager.instance().getZooKeeper().exists("/DependencyCheck", false);
This is just a read, so I don't think it really has a memory impact. Please let me know if that is wrong.
 
I will try shutting down the ZK client and watching the memory via the YourKit profiler. I will post the results afterwards.
 
Thanks,
Victor
 
 


Re: Memory leak in zookeeper 3.3.2 and 3.3.3?

Posted by Ted Dunning <te...@gmail.com>.
What happens if you stop the client (either an orderly shutdown, closing ZK
or a hard stop with enough time for ephemerals to go away)?

Does GC then reclaim the memory?

What does the dump command show in terms of how many connections and
ephemerals there are?

What does ls in the command line client show for how many znodes there are?

Usually when I see this sort of behavior, it means that I have been accumulating
data in ZK in a way that I didn't intend.  I have had ZK up for months to
years without seeing this behavior.
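
The dump and stat checks asked about above can also be scripted instead of typed into telnet. The following is a minimal sketch, assuming the server answers four-letter commands on its client port and closes the connection after replying; the class and method names (FourLetterWordSketch, exchange, send), the host, the port 2181, and the timeout are illustrative assumptions rather than anything prescribed by ZooKeeper.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Sends a four-letter command (e.g. "dump", "stat") and returns the reply text.
final class FourLetterWordSketch {

    /** Writes the command to 'out', then reads 'in' until end of stream. */
    static String exchange(OutputStream out, InputStream in, String command) throws IOException {
        out.write(command.getBytes(StandardCharsets.US_ASCII));
        out.flush();
        ByteArrayOutputStream reply = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) { // server is assumed to close after replying
            reply.write(buf, 0, n);
        }
        return new String(reply.toByteArray(), StandardCharsets.US_ASCII);
    }

    /** Opens a TCP connection to the server and performs one command exchange. */
    static String send(String host, int port, String command, int timeoutMillis) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            socket.setSoTimeout(timeoutMillis);
            return exchange(socket.getOutputStream(), socket.getInputStream(), command);
        }
    }

    public static void main(String[] args) throws IOException {
        // Host and port are assumptions; adjust to the server under test.
        System.out.print(send("localhost", 2181, "dump", 5000));
        System.out.print(send("localhost", 2181, "stat", 5000));
    }
}
```

Running this periodically during a load test and keeping the output gives a record of sessions, ephemerals, and node count over time, which is the evidence the questions above are asking for.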
