Posted to dev@zookeeper.apache.org by "Michi Mutsuzaki (JIRA)" <ji...@apache.org> on 2014/03/14 05:02:42 UTC
[jira] [Commented] (ZOOKEEPER-1894) ObserverTest.testObserver fails consistently
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934542#comment-13934542 ]
Michi Mutsuzaki commented on ZOOKEEPER-1894:
--------------------------------------------
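It looks like the observer (myid:3) is sending a lot of messages to itself during leader election. The first block is the logging I added to FastLeaderElection; the second is the output it produced.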
{noformat}
diff --git src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java
index 9876c3d..1e28209 100644
--- src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java
+++ src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java
@@ -248,6 +248,10 @@ public class FastLeaderElection implements Election {
     long relectionEpoch = response.buffer.getLong();
     long rpeerepoch;
+    LOG.info("Received a message sid={} state={} " +
+            "rleader={} rzxid={} relectionEpoch={}",
+            response.sid, rstate, rleader,
+            rzxid, relectionEpoch);
     if(!backCompatibility28){
         rpeerepoch = response.buffer.getLong();
     } else {
{noformat}
{noformat}
[junit] 2014-03-13 20:49:49,771 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@251] - Received a message sid=3 state=3 rleader=2 rzxid=0 relectionEpoch=1
[junit] 2014-03-13 20:49:49,772 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@251] - Received a message sid=3 state=3 rleader=2 rzxid=0 relectionEpoch=1
[junit] 2014-03-13 20:49:49,772 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@251] - Received a message sid=3 state=3 rleader=2 rzxid=0 relectionEpoch=1
[junit] 2014-03-13 20:49:49,772 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@251] - Received a message sid=3 state=3 rleader=2 rzxid=0 relectionEpoch=1
[junit] 2014-03-13 20:49:49,772 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@251] - Received a message sid=3 state=3 rleader=2 rzxid=0 relectionEpoch=1
[junit] 2014-03-13 20:49:49,773 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@251] - Received a message sid=3 state=3 rleader=2 rzxid=0 relectionEpoch=1
...
{noformat}
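To put a number on how many of these notifications are self-addressed, the log can be scanned for lines where the receiving server's myid equals the sender's sid. The following is only a rough sketch, not part of ZooKeeper: the class name SelfMessageCounter and its method are hypothetical, and the pattern assumes the log format produced by the LOG.info call above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a ZooKeeper class): counts log lines in which
// a server reports receiving an election message from itself.
public class SelfMessageCounter {
    // Captures the receiver id from "[myid:N]" and the sender id from "sid=N".
    // Assumes the line format emitted by the added LOG.info statement.
    private static final Pattern LINE =
            Pattern.compile("\\[myid:(\\d+)\\].*Received a message sid=(\\d+)\\b");

    /** Returns the number of lines where the receiver id equals the sender id. */
    public static long countSelfMessages(List<String> lines) {
        long count = 0;
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.find() && m.group(1).equals(m.group(2))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            // Line taken from the log excerpt above.
            "[junit] 2014-03-13 20:49:49,771 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@251] - Received a message sid=3 state=3 rleader=2 rzxid=0 relectionEpoch=1",
            // Constructed line for contrast (not from the real log): sender differs from receiver.
            "[junit] 2014-03-13 20:49:49,771 [myid:3] - INFO [WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@251] - Received a message sid=2 state=2 rleader=2 rzxid=0 relectionEpoch=1");
        System.out.println("self-addressed messages: " + countSelfMessages(sample));
    }
}
```

Feeding it the lines of the attached test log (after decompressing) would give the total for a full run; comparing that against the count of all "Received a message" lines would show what fraction of the election traffic is the observer talking to itself.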
> ObserverTest.testObserver fails consistently
> --------------------------------------------
>
> Key: ZOOKEEPER-1894
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1894
> Project: ZooKeeper
> Issue Type: Bug
> Components: quorum
> Affects Versions: 3.5.0
> Environment: ubuntu 13.10
> Server environment:java.version=1.7.0_51
> Server environment:java.vendor=Oracle Corporation
> Reporter: Michi Mutsuzaki
> Fix For: 3.5.0
>
> Attachments: TEST-org.apache.zookeeper.test.ObserverTest.txt.gz
>
>
> ObserverTest.testObserver fails consistently on my box. It looks like the observer (myid:3) calls QuorumPeer.getQuorumVerifier() in a tight loop, and the leader (myid:2) is not getting enough CPU time to synchronize with the follower and the observer. The test passes if I increase ClientBase.CONNECTION_TIMEOUT from 30 seconds to 120 seconds. I'll attach a log file.