Posted to dev@zookeeper.apache.org by "Patrick Hunt (JIRA)" <ji...@apache.org> on 2009/10/14 19:30:31 UTC

[jira] Updated: (ZOOKEEPER-528) c client exists() call with watch on large number of nodes (>100k) causes connection loss

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-528:
-----------------------------------

    Release Note: Workaround: the test environment in this case had a max heap of 64m. Increasing the max heap via -Xmx addressed the performance issue, and the test ran fine.

> c client exists() call with watch on large number of nodes (>100k) causes connection loss
> -----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-528
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-528
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.2.0
>            Reporter: Patrick Hunt
>            Assignee: Patrick Hunt
>            Priority: Critical
>             Fix For: 3.3.0
>
>
> If I create 100k nodes on /misc then
>       CPPUNIT_ASSERT_EQUAL(0, zoo_get_children(zh2, "/misc", 0, &children));
>       for (int i = 0; i < children.count; i++) {
>         sprintf(path, "/misc/%s", children.data[i]);
>         CPPUNIT_ASSERT_EQUAL(0, zoo_exists(zh2, path, 1, &stat));
>         CPPUNIT_ASSERT_EQUAL(0, zoo_wexists(zh3, path, watcher, &ctx3, &stat));
>       }
> Around 47k iterations into the loop the client fails with -4 (connection loss); the client timeout is 30 seconds. The server command port shows the following, so it looks like the problem is not the server but some issue with watcher registration on the c client?
> phunt@valhalla:~$ echo stat | nc localhost 22181
> Zookeeper version: 3.3.0--1, built on 07/22/2009 23:55 GMT
> Clients:
>  /127.0.0.1:45729[1](queued=0,recved=100024,sent=0)
>  /127.0.0.1:50229[1](queued=0,recved=0,sent=0)
>  /127.0.0.1:45731[1](queued=0,recved=47116,sent=0)
>  /127.0.0.1:45730[1](queued=0,recved=47117,sent=1)
> Latency min/avg/max: 0/196/1026
> Received: 194257
> Sent: 1
> Outstanding: 0
> Zxid: 0x186a4
> Mode: standalone
> Node count: 100005
> Port 45729 is a separate client, the one that created the nodes originally.
> Ports 45731 and 45730 are zh2/zh3 in the code.
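
The per-client lines in the stat output above are what tie the failure point (around 47k) to the two looping clients. As a rough sketch, assuming the line layout is exactly as shown in this message (other versions may print it differently), the counters could be pulled out in C as follows. The names client_stat and parse_client_line are hypothetical helpers for illustration and are not part of the ZooKeeper C client.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type (not part of the ZooKeeper C client):
 * counters taken from one per-client line of the `stat` output. */
struct client_stat {
    int  port;    /* client source port, e.g. 45731 */
    long queued;
    long recved;
    long sent;
};

/* Parses a line laid out like the ones in this message, e.g.
 *   " /127.0.0.1:45731[1](queued=0,recved=47116,sent=0)"
 * The bracketed field after the port is skipped, since the message
 * does not say what it means.
 * Returns 0 on success, -1 if the line does not match that layout. */
static int parse_client_line(const char *line, struct client_stat *out)
{
    const char *colon;
    const char *paren;

    assert(line != NULL && out != NULL);

    /* The port follows the last ':' on the line. */
    colon = strrchr(line, ':');
    if (colon == NULL || sscanf(colon, ":%d", &out->port) != 1)
        return -1;

    /* The counters sit inside the parentheses after the port. */
    paren = strchr(colon, '(');
    if (paren == NULL)
        return -1;
    if (sscanf(paren, "(queued=%ld,recved=%ld,sent=%ld)",
               &out->queued, &out->recved, &out->sent) != 3)
        return -1;

    return 0;
}
```

Feeding it the 45731 and 45730 lines from the output above gives recved values of 47116 and 47117, which is where both clients stopped. Lines without counters, such as the "Latency" line, are rejected with -1.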
