Posted to issues@hbase.apache.org by "HBase Review Board (JIRA)" <ji...@apache.org> on 2010/07/21 02:54:52 UTC

[jira] Commented: (HBASE-2858) TestReplication.queueFailover fails half the time

    [ https://issues.apache.org/jira/browse/HBASE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890518#action_12890518 ] 

HBase Review Board commented on HBASE-2858:
-------------------------------------------

Message from: "Jean-Daniel Cryans" <jd...@apache.org>

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/349/
-----------------------------------------------------------

Review request for hbase.


Summary
-------

This patch fixes the ZKW.listZNodes issue and cleans up the path handling in ReplicationSource a bit by removing a lock and adding logic to figure out where the log was moved. The test now passes 100% of the time for me (up from 50%).

There's one open issue, as outlined by the two TODOs: what happens if a log is missing from HDFS? When the queue is recovered, it could mean that HDFS was cleared but not ZK, but during normal operations it would point to a bug. Should we report and continue?
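
To make the open question concrete, here is a minimal sketch of the "look in several places, then report and continue" option. It is not the code in the patch. The class name, method names, and directory names are all hypothetical, and the existence check is passed in as a predicate so that the sketch does not depend on HDFS.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch, not taken from the patch: probe a list of candidate
// directories for a log file, and skip the log if it is in none of them.
class LogLocatorSketch {

    // Returns the first "dir/logName" path for which exists.test(...) is true,
    // or Optional.empty() if the log is in none of the candidate directories.
    static Optional<String> locate(String logName,
                                   List<String> candidateDirs,
                                   Predicate<String> exists) {
        for (String dir : candidateDirs) {
            String candidate = dir + "/" + logName;
            if (exists.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    // "Report and continue": returns true if the log was found (a real source
    // would read and ship its edits here), false if it was reported and skipped.
    static boolean replicateOrSkip(String logName,
                                   List<String> candidateDirs,
                                   Predicate<String> exists) {
        Optional<String> path = locate(logName, candidateDirs, exists);
        if (!path.isPresent()) {
            System.err.println("Log " + logName
                + " not found in any known location, skipping it");
            return false;
        }
        return true;
    }
}
```

With candidate directories such as "/logs/live-server" and "/logs/archived" (made-up names), a log that has been moved is still found as long as its new location is in the list. A log that is in none of them is reported once and skipped, so the queue does not stall on it.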


This addresses bug HBASE-2858.
    http://issues.apache.org/jira/browse/HBASE-2858


Diffs
-----

  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e6b365e 
  src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java a037aae 
  src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 6b9dcb5 
  src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java 2e13a0a 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java e8dd268 
  src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java 163671f 
  src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java bb09bc3 

Diff: http://review.hbase.org/r/349/diff


Testing
-------

Unit testing.


Thanks,

Jean-Daniel




> TestReplication.queueFailover fails half the time
> -------------------------------------------------
>
>                 Key: HBASE-2858
>                 URL: https://issues.apache.org/jira/browse/HBASE-2858
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.90.0
>
>
> TestReplication.queueFailover fails 50% of the time. This is because ZooKeeperWrapper.listZnodes (introduced in HBASE-2694 and missed by HBASE-2735) doesn't use the Watcher it's passed, so ReplicationSource sometimes misses hlogs to replicate for the region server we kill. While fixing that, I also uncovered a second issue: ReplicationSource ignores log files too quickly when the master is a bit too slow to split logs.
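
The class of bug described above can be illustrated with a small, self-contained sketch. It does not use the real ZooKeeper or HBase classes: FakeZk and ChildWatcher are hypothetical in-memory stand-ins, and the two listZnodes variants only mimic the shape of the problem (a watcher parameter that is accepted but never registered).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these types are the real ZooKeeper/HBase API.
class WatcherSketch {

    // Stand-in for a watcher callback.
    interface ChildWatcher {
        void childrenChanged(String path);
    }

    // In-memory stand-in for a znode tree with one-shot child watches.
    static class FakeZk {
        private final Map<String, Set<String>> children = new HashMap<>();
        private final Map<String, List<ChildWatcher>> watchers = new HashMap<>();

        List<String> getChildren(String path, ChildWatcher watcher) {
            if (watcher != null) {
                watchers.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
            }
            return new ArrayList<>(children.getOrDefault(path, new TreeSet<String>()));
        }

        void addChild(String path, String child) {
            children.computeIfAbsent(path, p -> new TreeSet<String>()).add(child);
            // Fire and clear the watches registered on this path.
            List<ChildWatcher> fired = watchers.remove(path);
            if (fired != null) {
                for (ChildWatcher w : fired) {
                    w.childrenChanged(path);
                }
            }
        }
    }

    // Buggy shape: the watcher argument is accepted but silently dropped.
    static List<String> listZnodesBuggy(FakeZk zk, String path, ChildWatcher watcher) {
        return zk.getChildren(path, null);
    }

    // Fixed shape: the watcher is passed through, so later changes are seen.
    static List<String> listZnodesFixed(FakeZk zk, String path, ChildWatcher watcher) {
        return zk.getChildren(path, watcher);
    }

    // Lists once, then adds a child; returns how many notifications arrived.
    static int notificationsAfterAdd(boolean fixed) {
        FakeZk zk = new FakeZk();
        int[] count = {0};
        ChildWatcher w = p -> count[0]++;
        if (fixed) {
            listZnodesFixed(zk, "/rs", w);
        } else {
            listZnodesBuggy(zk, "/rs", w);
        }
        zk.addChild("/rs", "server-2");
        return count[0];
    }
}
```

In the buggy variant the caller believes it is watching the path but is never notified of the new child, which matches the symptom above: the source intermittently misses work that shows up after the initial listing.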
