Posted to dev@zookeeper.apache.org by "Patrick Hunt (JIRA)" <ji...@apache.org> on 2009/07/09 18:40:15 UTC

[jira] Created: (ZOOKEEPER-460) bad testRetry in cppunit tests (hudson failure)

bad testRetry in cppunit tests (hudson failure)
-----------------------------------------------

                 Key: ZOOKEEPER-460
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-460
             Project: Zookeeper
          Issue Type: Bug
          Components: c client, tests
            Reporter: Patrick Hunt
            Assignee: Henry Robinson
             Fix For: 3.2.1, 3.3.0


The following code failed on Hudson:
http://hudson.zones.apache.org/hudson/view/ZooKeeper/job/ZooKeeper-trunk/371/

      watchctx_t ctx1, ctx2;
      zhandle_t *zk1 = createClient(&ctx1);
      CPPUNIT_ASSERT_EQUAL(true, ctx1.waitForConnected(zk1));
      zhandle_t *zk2 = createClient(&ctx2);
      zookeeper_close(zk1);
      CPPUNIT_ASSERT_EQUAL(true, ctx2.waitForConnected(zk2));

There's a problem with this test: it assumes that close(1) can be called before createClient(2) gets connected.

This is not correct: createClient is an async call, and in some cases the connection can be established before
createClient returns.

The failure shows this case: client 1 was created, then client 2 attempted to connect
but failed because of this setting on the server (max conn exceeded):
        sprintf(cmd, "export ZKMAXCNXNS=1;%s startClean %s", ZKSERVER_CMD, getHostPorts());

Connection 2 failed, and therefore the following assert eventually failed.

This code should not assume that close(1) will beat connect(2).
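
The two orderings at issue can be modelled without ZooKeeper at all. The sketch below is a toy, assuming a server that simply rejects any connection beyond its limit; LimitedServer and attemptsNeeded are hypothetical names invented for illustration and are not part of the C client or its tests. It only shows that client 2 may need one attempt or two depending on whether close(1) or connect(2) arrives first, so a test cannot rely on either order.

```cpp
#include <cstddef>

// Hypothetical stand-in for a server started with ZKMAXCNXNS=1.
// This is a toy model, not ZooKeeper code.
class LimitedServer {
public:
    explicit LimitedServer(std::size_t maxCnxns) : maxCnxns_(maxCnxns) {}

    // Accepts the connection only while a slot is free.
    bool tryConnect() {
        if (open_ >= maxCnxns_) return false;
        ++open_;
        return true;
    }

    // Frees one slot, if any is in use.
    void close() {
        if (open_ > 0) --open_;
    }

private:
    std::size_t maxCnxns_;
    std::size_t open_ = 0;
};

// Returns how many attempts client 2 needs to get connected.
// closeFirst == true  models close(1) arriving before connect(2).
// closeFirst == false models connect(2) arriving before close(1).
int attemptsNeeded(bool closeFirst) {
    LimitedServer server(1);
    server.tryConnect();                  // client 1 takes the only slot
    if (closeFirst) server.close();       // close(1) wins the race
    if (server.tryConnect()) return 1;    // client 2 connects with no contention
    server.close();                       // close(1) arrives after the rejection
    return server.tryConnect() ? 2 : -1;  // client 2 retries into the freed slot
}
```

In this model attemptsNeeded(true) is 1 and attemptsNeeded(false) is 2; a test written against the real client would have to tolerate both.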


Henry can you take a look?



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-460) bad testRetry in cppunit tests (hudson failure)

Posted by "Henry Robinson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731030#action_12731030 ] 

Henry Robinson commented on ZOOKEEPER-460:
------------------------------------------

I'll take a look.



[jira] Commented: (ZOOKEEPER-460) bad testRetry in cppunit tests (hudson failure)

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732185#action_12732185 ] 

Patrick Hunt commented on ZOOKEEPER-460:
----------------------------------------

ZOOKEEPER-473 should fix a lot of this, esp the *LE tests (socket reuse)




[jira] Commented: (ZOOKEEPER-460) bad testRetry in cppunit tests (hudson failure)

Posted by "Henry Robinson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731380#action_12731380 ] 

Henry Robinson commented on ZOOKEEPER-460:
------------------------------------------

I need a little help getting to the bottom of this (I might be misreading Hudson's logs).

The code in question is, I think, 'ok' (although a bit dodgy). The idea is to test the ability of a client - that is waiting because the max cnxns limit has been reached - to reconnect once a slot becomes free on the server. So ideally for this test close(1) should happen after createclient(2) has connected. As you say, this is a false assumption as the close might happen before the createClient(2) succeeds so there is no contention, but this should only be giving false positives - the second assert should eventually succeed. What I need to do to improve this is to replace createClient with a call that blocks until we at least know the connection attempt has been made, if that's possible.
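
One way to get a call that blocks until the connection attempt is known to have been made is a small latch that the test waits on. This is only a sketch: AttemptLatch, markAttempted and waitForAttempt are hypothetical names, and whether the C client offers a place to call markAttempted from is exactly the open question here.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper, not part of the ZooKeeper C client or its tests.
class AttemptLatch {
public:
    // To be called from whatever hook observes that a connection
    // attempt has been made (assumed to exist; see the note above).
    void markAttempted() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            attempted_ = true;
        }
        cond_.notify_all();
    }

    // Blocks until markAttempted() has run or the timeout expires.
    // Returns true if the attempt was observed.
    bool waitForAttempt(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cond_.wait_for(lock, timeout, [this] { return attempted_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    bool attempted_ = false;
};
```

A test could then wait on client 2's latch before calling zookeeper_close(zk1), so that the close cannot race ahead of the attempt.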

However the most recent Hudson failures don't seem to be related. From build 375:

     [exec] Zookeeper_simpleSystem::testAsyncWatcherAutoReset : assertion
     [exec] Zookeeper_watchers::testDefaultSessionWatcher1 : OK
     [exec] Zookeeper_watchers::testDefaultSessionWatcher2 : OK
     [exec] Zookeeper_watchers::testObjectSessionWatcher1 : OK
     [exec] Zookeeper_watchers::testObjectSessionWatcher2 : OK
     [exec] Zookeeper_watchers::testNodeWatcher1 : OK
     [exec] Zookeeper_watchers::testChildWatcher1 : OK
     [exec] Zookeeper_watchers::testChildWatcher2 : OK
     [exec] 
     [exec] /home/hudson/hudson-slave/workspace/ZooKeeper-trunk/trunk/src/c/tests/TestClient.cc:289: Assertion: equality assertion failed [Expected: -101, Actual  : -4]
     [exec] Failures !!!
     [exec] Run: 32   Failure total: 1   Failures: 1   Errors: 0
     [exec] make: *** [run-check] Error 1

and the same from 376 (yesterday's build). These are failing in TestClient (specifically testAsyncWatcherAutoReset). The error here is that a stat completion callback is getting called with ZCONNECTIONLOSS, but is expecting to see ZNONODE, and the assert is failing.

This test runs fine for me locally, so is the problem a heavily loaded Hudson, causing the connection loss?

Similarly the failed build you point to, 371, fails TestClientRetry with a broken pipe error which to my novice eye sounds a bit like something falling over under load.

It looks to me right now like the TestClientRetry code needs improving, but is benign as it should only cause false positives, and we need to understand the reasons why TestClient is failing. Does that sound right?



[jira] Updated: (ZOOKEEPER-460) bad testRetry in cppunit tests (hudson failure)

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-460:
------------------------------------

    Assignee: Mahadev konar  (was: Henry Robinson)



[jira] Commented: (ZOOKEEPER-460) bad testRetry in cppunit tests (hudson failure)

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731026#action_12731026 ] 

Patrick Hunt commented on ZOOKEEPER-460:
----------------------------------------

The build has been failing for the past 6 days. This is very bad: in effect we have no CI.

Henry, can you look at this or should I?




[jira] Commented: (ZOOKEEPER-460) bad testRetry in cppunit tests (hudson failure)

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733312#action_12733312 ] 

Mahadev konar commented on ZOOKEEPER-460:
-----------------------------------------

The tests are failing because the servers are not able to start for the cppunit tests. The following is the exception on servers run via the cppunit tests:

{code}
[CLOVER] FATAL ERROR: Clover could not be initialised. Are you sure you have Clover in the runtime classpath? (class java.lang.NoClassDefFoundError:com_cenqua_clover/CloverVersionInfo)
Exception in thread "main" java.lang.NoClassDefFoundError: com_cenqua_clover/CoverageRecorder
    at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:48)
Caused by: java.lang.ClassNotFoundException: com_cenqua_clover.CoverageRecorder
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader
{code}




[jira] Commented: (ZOOKEEPER-460) bad testRetry in cppunit tests (hudson failure)

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732182#action_12732182 ] 

Mahadev konar commented on ZOOKEEPER-460:
-----------------------------------------

Each time, it seems, we have a different reason for the tests failing. This is bad.
On build 377
http://hudson.zones.apache.org/hudson/view/ZooKeeper/job/ZooKeeper-trunk/377/

the tests fail because of a failure in FLENewEpochTest; other times it's mostly the C tests.




[jira] Commented: (ZOOKEEPER-460) bad testRetry in cppunit tests (hudson failure)

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731616#action_12731616 ] 

Patrick Hunt commented on ZOOKEEPER-460:
----------------------------------------

Hm, I don't have access to run code on Hudson. Mahadev does, though, so I'll check with him.



[jira] Updated: (ZOOKEEPER-460) bad testRetry in cppunit tests (hudson failure)

Posted by "Giridharan Kesavan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giridharan Kesavan updated ZOOKEEPER-460:
-----------------------------------------

    Attachment: zookeeper-460.patch

this should fix the clover classpath issue



[jira] Resolved: (ZOOKEEPER-460) bad testRetry in cppunit tests (hudson failure)

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt resolved ZOOKEEPER-460.
------------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]

this patch fixed the problem on hudson, thanks Giri!




[jira] Commented: (ZOOKEEPER-460) bad testRetry in cppunit tests (hudson failure)

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732175#action_12732175 ] 

Mahadev konar commented on ZOOKEEPER-460:
-----------------------------------------

I just ran the tests on vesta manually and they pass on that machine.

I see this on build 371:
{code}
 [exec] make: *** [run-check] Broken pipe
 [exec] Running Zookeeper_clientretry::testRetry
{code}

This looks more like the C API generating SIGPIPE and the C tests crashing on it. All we need to do is ignore SIGPIPE. Let me check the other builds to see what kind of error they have.
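
Ignoring SIGPIPE could look roughly like the following. This is a sketch rather than the actual patch: ignoreSigpipe and writeToClosedPipe are hypothetical helpers, and where the call would go in the test setup is an assumption. With the signal ignored, a write to a peer that has gone away fails with EPIPE instead of killing the process.

```cpp
#include <cerrno>
#include <signal.h>
#include <unistd.h>

// Hypothetical helper: ignore SIGPIPE for the whole process.
// Returns true if the disposition was installed.
bool ignoreSigpipe() {
    struct sigaction sa {};
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGPIPE, &sa, nullptr) == 0;
}

// Hypothetical demo: write to a pipe whose read end is already closed.
// Returns the errno from write(), 0 if the write unexpectedly succeeded,
// or -1 if the pipe could not be created.
int writeToClosedPipe() {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    close(fds[0]);                        // no reader is left
    char byte = 'x';
    ssize_t n = write(fds[1], &byte, 1);  // would raise SIGPIPE if not ignored
    int err = (n < 0) ? errno : 0;
    close(fds[1]);
    return err;
}
```

Called once before the tests run, ignoreSigpipe() would let the client see the failed write as an ordinary error and carry on with its usual connection-loss handling.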
