Posted to dev@zookeeper.apache.org by Patrick Hunt <ph...@apache.org> on 2011/07/22 19:58:18 UTC

Out of memory running ZK unit tests against trunk

I've never seen this before, but in my CI environment (sun jdk
1.6.0_20) I'm seeing some intermittent failures such as the following.

Has anyone added/modified tests for 3.4.0 that might be using more
threads/memory than previously? Creating ZK clients but not closing
them, etc...

java.lang.OutOfMemoryError: unable to create new native thread
       at java.lang.Thread.start0(Native Method)
       at java.lang.Thread.start(Thread.java:597)
       at org.apache.zookeeper.server.NIOServerCnxnFactory.start(NIOServerCnxnFactory.java:114)
       at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:406)
       at org.apache.zookeeper.test.QuorumBase.startServers(QuorumBase.java:186)
       at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:103)
       at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:67)
       at org.apache.zookeeper.test.QuorumZxidSyncTest.setUp(QuorumZxidSyncTest.java:39)

Patrick

Re: Out of memory running ZK unit tests against trunk

Posted by Patrick Hunt <ph...@apache.org>.
Hi Laxman, you want to take a stab at it?
https://issues.apache.org/jira/browse/ZOOKEEPER-1140

Can you follow up with Flavio/Henry about the read-only issue? Shouldn't
such a feature be enabled only when R/O support is enabled? (My
assumption is that it should be off by default and turned on via a
configuration option.)

Patrick

On Fri, Jul 29, 2011 at 7:00 AM, Laxman <la...@huawei.com> wrote:
> In QuorumPeer, when the peer is in the LOOKING state, we start
> ReadOnlyZooKeeperServer in a separate thread. We then shut this server
> down even before it has started up, which has no effect. Also, as
> startup() is not a blocking call, QP keeps on spawning new servers.
>
> 1) ReadOnlyZooKeeperServer.startup() need not be called in a separate
> thread.
> 2) ReadOnlyZooKeeperServer.startup() is not a blocking call. We need to
> introduce a blocking method like Leader.lead() or Follower.followLeader().
> 3) Shutdown should be called only after the above-mentioned blocking call
> has returned.

RE: Out of memory running ZK unit tests against trunk

Posted by Laxman <la...@huawei.com>.
In QuorumPeer, when the peer is in the LOOKING state, we start
ReadOnlyZooKeeperServer in a separate thread. We then shut this server
down even before it has started up, which has no effect. Also, as
startup() is not a blocking call, QP keeps on spawning new servers.

1) ReadOnlyZooKeeperServer.startup() need not be called in a separate
thread.
2) ReadOnlyZooKeeperServer.startup() is not a blocking call. We need to
introduce a blocking method like Leader.lead() or Follower.followLeader().
3) Shutdown should be called only after the above-mentioned blocking call
has returned.
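
The ordering problem described in the three points above can be sketched in
isolation. This is only an illustration under stated assumptions:
SketchServer and StartupOrderSketch are hypothetical names, not ZooKeeper
classes, and the "shutdown before startup" ordering is forced by hand so
that the result is repeatable rather than dependent on thread timing.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a server whose startup() spawns a worker thread
// and returns at once. This is NOT the real ReadOnlyZooKeeperServer.
class SketchServer {
    static final AtomicInteger liveWorkers = new AtomicInteger();
    private Thread worker;

    // Non-blocking: start the worker thread and return immediately.
    synchronized void startup() {
        liveWorkers.incrementAndGet();
        worker = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE); // idle until interrupted
            } catch (InterruptedException e) {
                // shutdown() asked us to stop
            } finally {
                liveWorkers.decrementAndGet();
            }
        }, "SketchWorker");
        worker.setDaemon(true); // leaked workers must not keep the JVM alive
        worker.start();
    }

    // Has no effect unless startup() has already run.
    synchronized void shutdown() {
        if (worker == null) {
            return;
        }
        worker.interrupt();
        StartupOrderSketch.join(worker);
        worker = null;
    }
}

class StartupOrderSketch {
    static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Reported pattern: startup() is handed to another thread and shutdown()
    // runs first. Returns the number of worker threads left running.
    static int leakedByEarlyShutdown(int rounds) {
        int before = SketchServer.liveWorkers.get();
        for (int i = 0; i < rounds; i++) {
            SketchServer server = new SketchServer();
            Thread starter = new Thread(server::startup);
            server.shutdown(); // nothing started yet, so nothing is stopped
            starter.start();
            join(starter);
        }
        return SketchServer.liveWorkers.get() - before;
    }

    // Suggested ordering: start in the calling thread, block while the peer
    // is still LOOKING (elided here), and shut down only afterwards.
    static int leakedByOrderedShutdown(int rounds) {
        int before = SketchServer.liveWorkers.get();
        for (int i = 0; i < rounds; i++) {
            SketchServer server = new SketchServer();
            server.startup();
            // a blocking call, comparable to Leader.lead(), would sit here
            server.shutdown();
        }
        return SketchServer.liveWorkers.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("early shutdown leaks:   " + leakedByEarlyShutdown(5));
        System.out.println("ordered shutdown leaks: " + leakedByOrderedShutdown(5));
    }
}
```

In this sketch every round of the first variant leaves one worker running,
while the second variant leaves none, which matches the "QP keeps on
spawning new servers" symptom only in spirit; the real fix may look
different.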


-----Original Message-----
From: Patrick Hunt [mailto:phunt@apache.org] 
Sent: Friday, July 29, 2011 6:24 AM
To: dev@zookeeper.apache.org
Subject: Re: Out of memory running ZK unit tests against trunk

Near the end of this test (QuorumZxidSyncTest) there are tons of
threads running - 115 "ProcessThread" threads and a similar number of
SessionTracker threads.

Also I see ~100 ReadOnlyRequestProcessor threads - why is this running
as a separate thread? (henry/flavio?)

Regardless, I'll enter a 3.4.0 blocker to clean this up - I suspect
that the server shutdown is not shutting down fully for some reason.

Patrick



Re: Out of memory running ZK unit tests against trunk

Posted by Patrick Hunt <ph...@apache.org>.
Near the end of this test (QuorumZxidSyncTest) there are tons of
threads running - 115 "ProcessThread" threads and a similar number of
SessionTracker threads.

Also I see ~100 ReadOnlyRequestProcessor threads - why is this running
as a separate thread? (henry/flavio?)

Regardless, I'll enter a 3.4.0 blocker to clean this up - I suspect
that the server shutdown is not shutting down fully for some reason.

Patrick
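
One way to check the suspicion that shutdown leaves threads behind is to
snapshot the live threads before a test and list whatever is still alive
afterwards. The following is a minimal sketch using only the JDK;
ThreadLeakCheck and the thread name "ProcessThread-sketch" are made up for
illustration and are not part of the ZooKeeper test code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ThreadLeakCheck {
    // Threads alive at this moment (Thread uses identity equality).
    static Set<Thread> snapshot() {
        return new HashSet<>(Thread.getAllStackTraces().keySet());
    }

    // Sorted names of threads that are alive now but were not in 'before'.
    static List<String> leakedSince(Set<Thread> before) {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !before.contains(t)) {
                names.add(t.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws InterruptedException {
        Set<Thread> before = snapshot();

        // Stand-in for a server thread that a faulty shutdown left running.
        Thread leaked = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // asked to stop
            }
        }, "ProcessThread-sketch");
        leaked.setDaemon(true);
        leaked.start();

        System.out.println("still running: " + leakedSince(before));

        leaked.interrupt();
        leaked.join();
    }
}
```

In a test class, snapshot() could be taken in setUp and leakedSince()
checked in tearDown, with some allowance for JVM housekeeping threads that
may appear in between.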

On Thu, Jul 28, 2011 at 5:28 PM, Mahadev Konar <ma...@hortonworks.com> wrote:
> Nice find, Pat. I can't see a reason why that should happen. Can we
> just do a stack dump and compare?
>
> thanks
> mahadev

Re: Out of memory running ZK unit tests against trunk

Posted by Mahadev Konar <ma...@hortonworks.com>.
Nice find, Pat. I can't see a reason why that should happen. Can we
just do a stack dump and compare?

thanks
mahadev
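
A stack dump comparison is easier to read if thread names are grouped and
counted first. Below is a rough sketch of that idea: it collapses digits in
thread names so that numbered threads fall into one bucket, builds a count
per bucket, and reports only the buckets whose counts differ between two
dumps. ThreadHistogram and the sample names are assumptions for
illustration; real ZooKeeper thread names may be formatted differently.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class ThreadHistogram {
    // "ProcessThread-12" and "ProcessThread-7" both become "ProcessThread-#".
    static String normalize(String name) {
        return name.replaceAll("\\d+", "#");
    }

    // Count thread names per normalized bucket.
    static Map<String, Integer> histogram(Collection<String> names) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : names) {
            counts.merge(normalize(name), 1, Integer::sum);
        }
        return counts;
    }

    // Histogram of the threads alive in this JVM right now.
    static Map<String, Integer> current() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return histogram(names);
    }

    // Buckets whose counts differ, as (count in b) minus (count in a).
    static Map<String, Integer> diff(Map<String, Integer> a, Map<String, Integer> b) {
        Set<String> keys = new TreeSet<>(a.keySet());
        keys.addAll(b.keySet());
        Map<String, Integer> delta = new TreeMap<>();
        for (String key : keys) {
            int d = b.getOrDefault(key, 0) - a.getOrDefault(key, 0);
            if (d != 0) {
                delta.put(key, d);
            }
        }
        return delta;
    }

    public static void main(String[] args) {
        System.out.println(current());
    }
}
```

Running current() near the end of the same test on both branches and
passing the two results to diff() would show which thread groups grew.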

On Thu, Jul 28, 2011 at 1:54 PM, Patrick Hunt <ph...@apache.org> wrote:
> I tracked this down to a low ulimit setting (max processes) on the
> particular Jenkins host where this was failing.
>
> Specifically, the following test was failing on trunk but not on
> branch 3_3, which concerns me:
>    ./src/java/test/org/apache/zookeeper/test/QuorumZxidSyncTest.java
>
> There haven't been any real changes to this test between versions. Any
> insight into why the server is using more threads in trunk than in
> branch 3_3?
>
> Patrick

Re: Out of memory running ZK unit tests against trunk

Posted by Patrick Hunt <ph...@apache.org>.
I tracked this down to a low ulimit setting (max processes) on the
particular Jenkins host where this was failing.

Specifically, the following test was failing on trunk but not on
branch 3_3, which concerns me:
    ./src/java/test/org/apache/zookeeper/test/QuorumZxidSyncTest.java

There haven't been any real changes to this test between versions. Any
insight into why the server is using more threads in trunk than in
branch 3_3?

Patrick
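
On Linux, each JVM thread is a native thread that counts against the
per-user "max processes" limit, so a test that needs more threads than the
limit allows fails with "unable to create new native thread". To compare
how many threads a run needs on trunk and on the 3.3 branch, the peak live
thread count can be read from the JDK's ThreadMXBean. This is a small
sketch, not part of the ZooKeeper test harness; PeakThreads and
runConcurrently are hypothetical helpers.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class PeakThreads {
    // Peak number of live threads in this JVM while 'body' runs.
    static int peakDuring(Runnable body) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        mx.resetPeakThreadCount(); // peak restarts from the current live count
        body.run();
        return mx.getPeakThreadCount();
    }

    // Starts n threads, waits until all are running at once, then lets them exit.
    static void runConcurrently(int n) {
        CountDownLatch started = new CountDownLatch(n);
        CountDownLatch release = new CountDownLatch(1);
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = new Thread(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    // exit early
                }
            }, "sketch-" + i);
            threads[i].setDaemon(true);
            threads[i].start();
        }
        try {
            started.await();
            release.countDown();
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int live = ManagementFactory.getThreadMXBean().getThreadCount();
        int peak = peakDuring(() -> runConcurrently(10));
        System.out.println("live before: " + live + ", peak during: " + peak);
    }
}
```

Wrapping a test body in peakDuring() on each branch gives two numbers that
can be compared against the limit reported by `ulimit -u` on the CI host.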

On Fri, Jul 22, 2011 at 10:58 AM, Patrick Hunt <ph...@apache.org> wrote:
> I've never seen this before, but in my CI environment (sun jdk
> 1.6.0_20) I'm seeing some intermittent failures such as the following.
>
> Has anyone added/modified tests for 3.4.0 that might be using more
> threads/memory than previously? Creating ZK clients but not closing
> them, etc...
>
> java.lang.OutOfMemoryError: unable to create new native thread
>       at java.lang.Thread.start0(Native Method)
>       at java.lang.Thread.start(Thread.java:597)
>       at org.apache.zookeeper.server.NIOServerCnxnFactory.start(NIOServerCnxnFactory.java:114)
>       at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:406)
>       at org.apache.zookeeper.test.QuorumBase.startServers(QuorumBase.java:186)
>       at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:103)
>       at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:67)
>       at org.apache.zookeeper.test.QuorumZxidSyncTest.setUp(QuorumZxidSyncTest.java:39)
>
> Patrick
>