Posted to dev@zookeeper.apache.org by "amith (Created) (JIRA)" <ji...@apache.org> on 2011/11/09 10:55:51 UTC

[jira] [Created] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

One of the zookeeper server is not accepting any requests
---------------------------------------------------------

                 Key: ZOOKEEPER-1294
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
         Environment: 3 Zookeeper + 3 Observer with SuSe-11
            Reporter: amith


In zoo.cfg I have configured the servers as follows:
server.1 = XX.XX.XX.XX:65175:65173
server.2 = XX.XX.XX.XX:65185:65183
server.3 = XX.XX.XX.XX:65195:65193
server.4 = XX.XX.XX.XX:65205:65203:observer
server.5 = XX.XX.XX.XX:65215:65213:observer
server.6 = XX.XX.XX.XX:65225:65223:observer

As shown above, I have configured 3 participants and 3 observers
in a cluster of 6 ZooKeeper servers.
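
For readers checking the arithmetic behind this configuration: only the servers without the `:observer` suffix vote, so a majority here means 2 of the 3 participants. A minimal Java sketch of that count follows. The class and method names are hypothetical, this is not ZooKeeper's own configuration parser, and it assumes plain `host:port:port` entries (no IPv6 addresses).

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: a hypothetical helper, not part of ZooKeeper.
public class ServerLineSketch {

    // True when a "server.N = host:port:port[:observer]" line declares an observer.
    static boolean isObserver(String line) {
        String trimmed = line.trim();
        int eq = trimmed.indexOf('=');
        if (!trimmed.startsWith("server.") || eq < 0) {
            throw new IllegalArgumentException("not a server line: " + line);
        }
        String[] parts = trimmed.substring(eq + 1).trim().split(":");
        return parts.length >= 4 && "observer".equalsIgnoreCase(parts[3].trim());
    }

    // Counts the voting members, i.e. every server line that is not an observer.
    static int countParticipants(List<String> serverLines) {
        int count = 0;
        for (String line : serverLines) {
            if (!isObserver(line)) {
                count++;
            }
        }
        return count;
    }

    // Smallest strict majority of the voting members.
    static int majority(int participants) {
        return participants / 2 + 1;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "server.1 = XX.XX.XX.XX:65175:65173",
            "server.2 = XX.XX.XX.XX:65185:65183",
            "server.3 = XX.XX.XX.XX:65195:65193",
            "server.4 = XX.XX.XX.XX:65205:65203:observer",
            "server.5 = XX.XX.XX.XX:65215:65213:observer",
            "server.6 = XX.XX.XX.XX:65225:65223:observer");
        int participants = countParticipants(lines);
        System.out.println(participants + " participants, majority = " + majority(participants));
    }
}
```

With three participants, two of them must be running for the ensemble to serve requests; observers do not add to that count.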

Steps to reproduce the defect:
1. Start all 3 participant ZooKeeper servers
2. Stop all the participant ZooKeeper servers
3. Start ZooKeeper 1 (participant)
4. Start ZooKeeper 2 (participant)
5. Start ZooKeeper 4 (observer)
6. Create a persistent node with an external client, then close the client
7. Stop ZooKeeper 1 (participant; the quorum is now unstable)
8. Create a new client and try to find the node created earlier using the exists API (this fails because the quorum is not satisfied)
9. Start ZooKeeper 1 (participant; this stabilises the quorum)

Now check the observer (server.4) using the four-letter word command:
linux-216:/home/amith/CI/source/install/zookeeper/zookeeper2/bin # echo stat | netcat localhost 65200
Zookeeper version: 3.3.2-1031432, built on 11/05/2010 05:32 GMT
Clients:
 /127.0.0.1:46370[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 1
Sent: 0
Outstanding: 0
Zxid: 0x100000003
Mode: observer
Node count: 5

Check participant 2 with the four-letter word command:

Latency min/avg/max: 22/48/83
Received: 39
Sent: 3
Outstanding: 35
Zxid: 0x100000003
Mode: leader
Node count: 5
linux-216:/home/amith/CI/source/install/zookeeper/zookeeper2/bin #

Check participant 1 with the four-letter word command:

linux-216:/home/amith/CI/source/install/zookeeper/zookeeper2/bin # echo stat | netcat localhost 65170
This ZooKeeper instance is not currently serving requests
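
The same check can be scripted. The sketch below, assuming a plain (non-TLS) client port, sends a four-letter word over a TCP socket and pulls out the `Mode:` line from the reply. `FourLetterWordSketch`, `send` and `parseMode` are hypothetical names, and the default port 65170 is simply the one used in the command above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Illustrative only: a hypothetical helper that mimics "echo stat | netcat host port".
public class FourLetterWordSketch {

    // Sends the command, half-closes the socket like netcat does, and reads the full reply.
    static String send(String host, int port, String cmd, int timeoutMs) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            OutputStream out = socket.getOutputStream();
            out.write(cmd.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            socket.shutdownOutput();

            InputStream in = socket.getInputStream();
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toString("US-ASCII");
        }
    }

    // Returns the value of the "Mode:" line, or null when the reply has none
    // (for example when the server answers that it is not serving requests).
    static String parseMode(String statOutput) {
        for (String line : statOutput.split("\\r?\\n")) {
            if (line.startsWith("Mode: ")) {
                return line.substring("Mode: ".length()).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 65170;
        String reply = send(host, port, "stat", 3000);
        String mode = parseMode(reply);
        System.out.println(mode != null ? "Mode: " + mode : reply.trim());
    }
}
```

A missing `Mode:` line is treated as "not serving", which matches the reply shown above for participant 1.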

The participant 1 logs are filled with:
2011-11-08 15:49:51,360 - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:65170:NIOServerCnxn@642] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running


The problem is that participant 1 is not responding to or accepting any requests.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "Henry Robinson (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henry Robinson updated ZOOKEEPER-1294:
--------------------------------------

    Attachment: ZOOKEEPER-1294-1.patch

Hi Kavita - 

Patch looks good to me, thanks! I like the test particularly. 

I've taken the liberty of moving the test to its own class. The reasoning is that starting the ensemble, then shutting down the servers, then starting them again is an avoidable time sink, and the tests take long enough to run already. Setting up the ensemble the way we want it in the first place saves about 10s on my machine. 

I also added an extra assertion and a comment, and simplified building the server list. If you're happy with this, I'll go ahead and commit it. 

Thanks,
Henry
                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1294-1.patch, ZOOKEEPER-1294.patch
>

        

[jira] [Commented] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "kavita sharma (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183078#comment-13183078 ] 

kavita sharma commented on ZOOKEEPER-1294:
------------------------------------------

Hi Cam,

Sorry for the delay. I have uploaded the patch; please review it and
let me know if any modifications are needed.

Thanks
                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>         Attachments: ZOOKEEPER-1294.patch
>

        

[jira] [Commented] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "kavita sharma (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178329#comment-13178329 ] 

kavita sharma commented on ZOOKEEPER-1294:
------------------------------------------

Hi,

I recently hit the same issue. From my initial analysis, I think the problem could be: 'The Leader ZK is leading with the support of the Observer.'

+Detailed Description:+
Scenario: I started 3 ZKs (1 Leader, 1 Observer and 1 Follower) and then shut down the Follower.

The Leader created 2 LearnerHandlers (1 for the Observer and 1 for the Follower) and added them to 'learners'. After I shut down the Follower, the Leader was still in the LEADING state. When I went through the supporting logic, I found that it considers 'learners', which contains Followers as well as Observers. As far as I know, the Leader shouldn't remain in the LEADING state with the support of an Observer alone.

{noformat}
// lock on the followers when we use it.
syncedSet.add(self.getId());
synchronized (learners) {
     for (LearnerHandler f : learners) {
{noformat}

I feel it would be resolved if, instead of iterating over 'learners', we iterate over 'forwardingFollowers'. 
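
To make the scenario concrete, here is a minimal, self-contained Java sketch of the counting problem. It is not the code in Leader.java: `QuorumSketch`, `Peer`, `syncedSet` and `containsQuorum` are hypothetical names, and the strict-majority check is an assumption standing in for whatever quorum verifier the server is configured with.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: hypothetical types, not ZooKeeper's Leader or LearnerHandler.
public class QuorumSketch {

    enum LearnerType { PARTICIPANT, OBSERVER }

    static final class Peer {
        final long sid;
        final LearnerType type;
        final boolean synced;

        Peer(long sid, LearnerType type, boolean synced) {
            this.sid = sid;
            this.type = type;
            this.synced = synced;
        }
    }

    // Builds the set of server ids that support the leader, always including the leader itself.
    static Set<Long> syncedSet(long leaderSid, List<Peer> learners, boolean participantsOnly) {
        Set<Long> synced = new HashSet<>();
        synced.add(leaderSid);
        for (Peer p : learners) {
            boolean counts = !participantsOnly || p.type == LearnerType.PARTICIPANT;
            if (p.synced && counts) {
                synced.add(p.sid);
            }
        }
        return synced;
    }

    // Strict majority of the voting members (an assumed quorum rule for this sketch).
    static boolean containsQuorum(Set<Long> supporters, int votingMembers) {
        return supporters.size() > votingMembers / 2;
    }

    public static void main(String[] args) {
        // Leader is server 1, follower 2 has been shut down, observer 3 is still connected.
        List<Peer> learners = Arrays.asList(new Peer(3, LearnerType.OBSERVER, true));
        int votingMembers = 2; // the leader and the follower

        System.out.println("counting all learners: "
            + containsQuorum(syncedSet(1, learners, false), votingMembers));
        System.out.println("counting participants only: "
            + containsQuorum(syncedSet(1, learners, true), votingMembers));
    }
}
```

Counting every learner lets the observer stand in for the missing follower, so the leader believes it still has a quorum; counting participants only does not.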
                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma

        

[jira] [Commented] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185949#comment-13185949 ] 

Hadoop QA commented on ZOOKEEPER-1294:
--------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510535/ZOOKEEPER-1294-3.patch
  against trunk revision 1227927.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 5 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/905//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/905//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/905//console

This message is automatically generated.
                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1294-1.patch, ZOOKEEPER-1294-2.patch, ZOOKEEPER-1294-3.patch, ZOOKEEPER-1294.patch
>

        

[jira] [Commented] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "Henry Robinson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184762#comment-13184762 ] 

Henry Robinson commented on ZOOKEEPER-1294:
-------------------------------------------

These failures are legit; I'm looking into them now. 
                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1294-1.patch, ZOOKEEPER-1294.patch
>

        

[jira] [Commented] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185915#comment-13185915 ] 

Hadoop QA commented on ZOOKEEPER-1294:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510532/ZOOKEEPER-1294-2.patch
  against trunk revision 1227927.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 5 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    -1 release audit.  The applied patch generated 25 release audit warnings (more than the trunk's current 24 warnings).

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/903//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/903//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/903//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/903//console

This message is automatically generated.
                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1294-1.patch, ZOOKEEPER-1294-2.patch, ZOOKEEPER-1294.patch
>

        

[jira] [Updated] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "Henry Robinson (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henry Robinson updated ZOOKEEPER-1294:
--------------------------------------

    Attachment: ZOOKEEPER-1294-2.patch

This patch fixes the deadlock issue, and passes all tests locally for me. I also cleaned up the syncedCount variable that was redundant. 
                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1294-1.patch, ZOOKEEPER-1294-2.patch, ZOOKEEPER-1294.patch
>

        

[jira] [Commented] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "Henry Robinson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185454#comment-13185454 ] 

Henry Robinson commented on ZOOKEEPER-1294:
-------------------------------------------

So, after some investigation, I've found out what's happening with testNoLogBeforeLeaderEstablishment.

The patch changes the locking in Leader.java; now the lock around the sync-and-ping loop is on the forwardingFollowers set. The call to ping() with that lock held then takes the lock on the leader object. 

In the failing test runs, at the same time the ProposalRequestProcessor has locked the leader object in order to make a proposal in Leader.propose(). This then calls sendPacket, which (tries to) lock on forwardingFollowers. 

This is a classic deadlock - the threads try to take the same locks in a different order. Although there are a few options, I think actually the patch *shouldn't* be changing the set to forwardingFollowers, but should be using learners as before. This is because observers should be pinged as well, I think, so that they don't think they're dead. Instead, the code should explicitly test whether a learner is a PARTICIPANT as below:

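{code}
synchronized (learners) {
    for (LearnerHandler f : learners) {
        // Only synced participants count towards the quorum check.
        if (f.synced() && f.getLearnerType() == LearnerType.PARTICIPANT) {
            syncedCount++;
            syncedSet.add(f.getSid());
        }
        // Every learner, observers included, is still pinged.
        f.ping();
    }
}
{code}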

So only synced participants get added to the synced set, but every learner, observers included, gets pinged. This seems to fix the problem with this test, at least, for me. Any thoughts?
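
As a rough, standalone illustration of that loop, the sketch below uses simplified stand-ins. SyncedSetSketch, its Learner class and the pings counter are hypothetical and exist only for this example; the real code works on LearnerHandler objects inside Leader.java.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-ins for LearnerHandler and LearnerType.
final class SyncedSetSketch {
    enum LearnerType { PARTICIPANT, OBSERVER }

    static final class Learner {
        final long sid;
        final LearnerType type;
        final boolean synced;
        int pings = 0; // counts pings so the effect is observable in this sketch

        Learner(long sid, LearnerType type, boolean synced) {
            this.sid = sid;
            this.type = type;
            this.synced = synced;
        }

        void ping() {
            pings++;
        }
    }

    /** Pings every learner, but returns only the sids of synced participants. */
    static Set<Long> pingAllCountParticipants(List<Learner> learners) {
        Set<Long> syncedSet = new HashSet<>();
        synchronized (learners) {
            for (Learner f : learners) {
                if (f.synced && f.type == LearnerType.PARTICIPANT) {
                    syncedSet.add(f.sid);
                }
                f.ping(); // observers are pinged too, so they do not time out
            }
        }
        return syncedSet;
    }

    public static void main(String[] args) {
        List<Learner> learners = new ArrayList<>();
        learners.add(new Learner(1, LearnerType.PARTICIPANT, true));
        learners.add(new Learner(2, LearnerType.PARTICIPANT, false));
        learners.add(new Learner(4, LearnerType.OBSERVER, true));
        // prints "synced participants: [1]"
        System.out.println("synced participants: " + pingAllCountParticipants(learners));
    }
}
```

The synced observer (sid 4) receives a ping but is not counted, so it cannot make an unstable quorum look healthy.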
                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1294-1.patch, ZOOKEEPER-1294.patch

        

[jira] [Commented] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184589#comment-13184589 ] 

Hadoop QA commented on ZOOKEEPER-1294:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510273/ZOOKEEPER-1294-1.patch
  against trunk revision 1227927.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 5 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    -1 release audit.  The applied patch generated 25 release audit warnings (more than the trunk's current 24 warnings).

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/900//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/900//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/900//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/900//console

This message is automatically generated.
                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1294-1.patch, ZOOKEEPER-1294.patch

        

[jira] [Commented] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185922#comment-13185922 ] 

Hadoop QA commented on ZOOKEEPER-1294:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510532/ZOOKEEPER-1294-2.patch
  against trunk revision 1227927.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 5 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    -1 release audit.  The applied patch generated 25 release audit warnings (more than the trunk's current 24 warnings).

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/904//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/904//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/904//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/904//console

This message is automatically generated.
                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1294-1.patch, ZOOKEEPER-1294-2.patch, ZOOKEEPER-1294.patch

        

[jira] [Commented] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "kavita sharma (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187483#comment-13187483 ] 

kavita sharma commented on ZOOKEEPER-1294:
------------------------------------------

Thanks Henry.
                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1294-1.patch, ZOOKEEPER-1294-2.patch, ZOOKEEPER-1294-3.patch, ZOOKEEPER-1294.patch

        

[jira] [Assigned] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "kavita sharma (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kavita sharma reassigned ZOOKEEPER-1294:
----------------------------------------

    Assignee: kavita sharma
    
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma

        

[jira] [Commented] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "kavita sharma (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185546#comment-13185546 ] 

kavita sharma commented on ZOOKEEPER-1294:
------------------------------------------

Hi Henry, yes, you are correct.
                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1294-1.patch, ZOOKEEPER-1294.patch

        

[jira] [Commented] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186156#comment-13186156 ] 

Hudson commented on ZOOKEEPER-1294:
-----------------------------------

Integrated in ZooKeeper-trunk #1427 (See [https://builds.apache.org/job/ZooKeeper-trunk/1427/])
    ZOOKEEPER-1294. One of the zookeeper server is not accepting any requests (Kavita Sharma via henryr)

henry : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1231389
Files : 
* /zookeeper/trunk/CHANGES.txt
* /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/Leader.java
* /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/ClientBase.java
* /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/ObserverLETest.java

                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1294-1.patch, ZOOKEEPER-1294-2.patch, ZOOKEEPER-1294-3.patch, ZOOKEEPER-1294.patch

        

[jira] [Updated] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "Henry Robinson (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henry Robinson updated ZOOKEEPER-1294:
--------------------------------------

    Hadoop Flags: Incompatible change,Reviewed  (was: Incompatible change)
    
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1294-1.patch, ZOOKEEPER-1294.patch

        

[jira] [Updated] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "Henry Robinson (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henry Robinson updated ZOOKEEPER-1294:
--------------------------------------

    Attachment: ZOOKEEPER-1294-3.patch

Ugh, forgot to switch from Cloudera copyright to Apache copyright header in Eclipse, which caused the RAT warning.

This patch fixes that, and then we should be good to go since all tests pass. Assuming this comes back clean again, I'll commit this.
                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1294-1.patch, ZOOKEEPER-1294-2.patch, ZOOKEEPER-1294-3.patch, ZOOKEEPER-1294.patch
>
>


        

[jira] [Commented] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "Camille Fournier (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182125#comment-13182125 ] 

Camille Fournier commented on ZOOKEEPER-1294:
---------------------------------------------

Glancing at the code, I think you might be right. Are you planning on writing a test and a fix for this or should I?
                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>


        

[jira] [Commented] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184757#comment-13184757 ] 

Hadoop QA commented on ZOOKEEPER-1294:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510273/ZOOKEEPER-1294-1.patch
  against trunk revision 1227927.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 5 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    -1 release audit.  The applied patch generated 25 release audit warnings (more than the trunk's current 24 warnings).

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/901//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/901//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/901//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/901//console

This message is automatically generated.
                
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1294-1.patch, ZOOKEEPER-1294.patch
>
>


        

[jira] [Updated] (ZOOKEEPER-1294) One of the zookeeper server is not accepting any requests

Posted by "kavita sharma (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kavita sharma updated ZOOKEEPER-1294:
-------------------------------------

    Attachment: ZOOKEEPER-1294.patch
    
> One of the zookeeper server is not accepting any requests
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-1294
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1294
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>         Environment: 3 Zookeeper + 3 Observer with SuSe-11
>            Reporter: amith
>            Assignee: kavita sharma
>         Attachments: ZOOKEEPER-1294.patch
>
>
