Posted to issues@hbase.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2013/01/05 01:44:47 UTC

[jira] [Commented] (HBASE-7440) ReplicationZookeeper#addPeer is racy

    [ https://issues.apache.org/jira/browse/HBASE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544440#comment-13544440 ] 

Hudson commented on HBASE-7440:
-------------------------------

Integrated in HBase-0.94-security-on-Hadoop-23 #10 (See [https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/10/])
    HBASE-7440 ReplicationZookeeper#addPeer is racy (Himanshu) (Revision 1426704)

     Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/CHANGES.txt
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java

                
> ReplicationZookeeper#addPeer is racy
> ------------------------------------
>
>                 Key: HBASE-7440
>                 URL: https://issues.apache.org/jira/browse/HBASE-7440
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.94.3
>            Reporter: Himanshu Vashishtha
>            Assignee: Himanshu Vashishtha
>             Fix For: 0.96.0, 0.94.4
>
>         Attachments: HBASE-7440-trunk-v0.patch, HBASE-7440-trunk-v1.patch, HBASE-7440-v0.patch, HBASE-7440-v1.patch, HBASE-7440-v2.patch
>
>
> While adding a peer, ReplicationZookeeper creates the znodes in three separate transactions:
> a) the peers znode,
> b) the peerId-specific znode, and
> c) the peerState znode.
> There is a PeerWatcher which invokes getPeer() (after steps b) and c)). If control reaches getPeer() while a peer is being added and step c) has not yet been processed, the peer may end up not being added. This happens while running TestMasterReplication#testCyclicReplication().
> {code}
> 2012-12-26 07:36:35,187 INFO  [RegionServer:0;p0120.XXXXX,38423,1356536179470-EventThread] zookeeper.RecoverableZooKeeper(447): Node /2/replication/peers/1/peer-state already exists and this is not a retry
> 2012-12-26 07:36:35,188 ERROR [RegionServer:0;p0120.XXXXX,38423,1356536179470-EventThread] regionserver.ReplicationSourceManager$PeersWatcher(527): Error while adding a new peer
> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /2/replication/peers/1/peer-state
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> 	at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
> 	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:428)
> 	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:410)
> 	at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1044)
> 	at org.apache.hadoop.hbase.replication.ReplicationPeer.startStateTracker(ReplicationPeer.java:82)
> 	at org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:344)
> 	at org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:307)
> 	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:519)
> 	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:315)
> 	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
> 	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> 2012-12-26 07:36:35,188 DEBUG [RegionServer:0;p0120.XXXXX,55742,1356536171947-EventThread] zookeeper.ZKUtil(1545): regionserver:55742-0x13bd7db39580004 Retrieved 36 byte(s) of data from znode /1/hbaseid; data=9ce66123-d3e8-4ae9-a249-afe03...
> {code}
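
The interleaving described above can be reproduced without a cluster. The following is a minimal sketch, not the HBase code and not the committed patch: it assumes the watcher path does a check-then-create on the peer-state znode (which the log is consistent with but does not prove), and it models the znode tree as an in-memory map. Every name in it (PeerStateRaceSketch, startTrackerCheckThenCreate, startTrackerTolerant) is hypothetical. The Runnable argument stands in for step c) of addPeer landing in the gap between the check and the create.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory model of the race; none of these names are HBase or ZooKeeper APIs.
class PeerStateRaceSketch {
    static final String PEER_STATE = "/replication/peers/1/peer-state";

    // Models the NodeExists failure seen in the stack trace.
    static class NodeExistsException extends Exception {
        NodeExistsException(String path) { super("NodeExists for " + path); }
    }

    // Stand-in for the znode tree: path -> data.
    final ConcurrentMap<String, String> znodes = new ConcurrentHashMap<>();

    // Strict create: throws if another writer got there first.
    void create(String path, String data) throws NodeExistsException {
        if (znodes.putIfAbsent(path, data) != null) {
            throw new NodeExistsException(path);
        }
    }

    // Assumed racy watcher path: check, then create. `between` runs in the gap,
    // standing in for addPeer's step c) arriving at the worst moment.
    boolean startTrackerCheckThenCreate(Runnable between) {
        try {
            if (!znodes.containsKey(PEER_STATE)) {
                between.run();
                create(PEER_STATE, "ENABLED");
            }
            return true;
        } catch (NodeExistsException e) {
            return false; // the peer is not added
        }
    }

    // Tolerant variant: an already existing node counts as success.
    boolean startTrackerTolerant(Runnable between) {
        between.run();
        znodes.putIfAbsent(PEER_STATE, "ENABLED");
        return znodes.containsKey(PEER_STATE);
    }

    public static void main(String[] args) {
        PeerStateRaceSketch racy = new PeerStateRaceSketch();
        boolean addedRacy = racy.startTrackerCheckThenCreate(
            () -> racy.znodes.putIfAbsent(PEER_STATE, "ENABLED"));

        PeerStateRaceSketch tolerant = new PeerStateRaceSketch();
        boolean addedTolerant = tolerant.startTrackerTolerant(
            () -> tolerant.znodes.putIfAbsent(PEER_STATE, "ENABLED"));

        System.out.println("check-then-create added peer: " + addedRacy);
        System.out.println("tolerant create added peer: " + addedTolerant);
    }
}
```

Treating "node already exists" as success is only one way to close the window; creating the three znodes in a single transaction would be another. Which approach the attached patches take is not stated in this message.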

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira