Posted to issues@hbase.apache.org by "HBase Review Board (JIRA)" <ji...@apache.org> on 2010/07/21 02:54:52 UTC
[jira] Commented: (HBASE-2858) TestReplication.queueFailover fails half the time
[ https://issues.apache.org/jira/browse/HBASE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890518#action_12890518 ]
HBase Review Board commented on HBASE-2858:
-------------------------------------------
Message from: "Jean-Daniel Cryans" <jd...@apache.org>
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/349/
-----------------------------------------------------------
Review request for hbase.
Summary
-------
This patch fixes the ZKW.listZNodes issue and clears up the path situation in ReplicationSource a bit by removing a lock and adding the wits to figure out where the log was moved. The test now passes 100% of the time for me (up from 50%).
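For illustration only: per the JIRA description, the listZnodes problem is that the method doesn't use the Watcher it's passed. The sketch below reproduces that kind of bug with a toy in-memory model. It is not the patch and does not use the real ZooKeeper or HBase APIs; FakeZNode, listZnodesBuggy, listZnodesFixed and notificationsSeen are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Toy model of a dropped watcher. NOT the real ZooKeeper or HBase API. */
public class WatcherSketch {

    /** Hypothetical in-memory znode whose children can be listed and watched. */
    static class FakeZNode {
        private final List<String> children = new ArrayList<>();
        private final List<Consumer<String>> watchers = new ArrayList<>();

        /** Lists the children and registers a one-shot watcher if one is given. */
        List<String> getChildren(Consumer<String> watcher) {
            if (watcher != null) {
                watchers.add(watcher);
            }
            return new ArrayList<>(children);
        }

        /** Adds a child and fires (then clears) every registered watcher. */
        void addChild(String name) {
            children.add(name);
            List<Consumer<String>> toFire = new ArrayList<>(watchers);
            watchers.clear();
            for (Consumer<String> w : toFire) {
                w.accept(name);
            }
        }
    }

    /** Buggy wrapper: accepts a watcher but never passes it on. */
    static List<String> listZnodesBuggy(FakeZNode node, Consumer<String> watcher) {
        return node.getChildren(null);
    }

    /** Fixed wrapper: forwards the watcher so the caller hears about changes. */
    static List<String> listZnodesFixed(FakeZNode node, Consumer<String> watcher) {
        return node.getChildren(watcher);
    }

    /** Returns how many change notifications the caller saw after one new child. */
    static int notificationsSeen(boolean fixed) {
        FakeZNode node = new FakeZNode();
        List<String> seen = new ArrayList<>();
        Consumer<String> watcher = seen::add;
        if (fixed) {
            listZnodesFixed(node, watcher);
        } else {
            listZnodesBuggy(node, watcher);
        }
        node.addChild("hlog.1"); // a new log shows up after the listing
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("buggy wrapper notifications: " + notificationsSeen(false));
        System.out.println("fixed wrapper notifications: " + notificationsSeen(true));
    }
}
```

With the buggy wrapper the caller is never told about the new child, which is roughly how a source could miss hlogs it was supposed to replicate.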
There's one open issue, as outlined by the two TODOs: what happens if a log is missing from HDFS? When the queue is recovered, it could mean that HDFS was cleared but not ZK, but during normal operations it would point to a bug. Report and continue?
This addresses bug HBASE-2858.
http://issues.apache.org/jira/browse/HBASE-2858
Diffs
-----
src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java e6b365e
src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java a037aae
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 6b9dcb5
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java 2e13a0a
src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java e8dd268
src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java 163671f
src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java bb09bc3
Diff: http://review.hbase.org/r/349/diff
Testing
-------
Unit testing.
Thanks,
Jean-Daniel
> TestReplication.queueFailover fails half the time
> -------------------------------------------------
>
> Key: HBASE-2858
> URL: https://issues.apache.org/jira/browse/HBASE-2858
> Project: HBase
> Issue Type: Bug
> Reporter: Jean-Daniel Cryans
> Assignee: Jean-Daniel Cryans
> Fix For: 0.90.0
>
>
> TestReplication.queueFailover fails 50% of the time because ZooKeeperWrapper.listZnodes (introduced in HBASE-2694 and missed by HBASE-2735) doesn't use the Watcher it's passed, so ReplicationSource sometimes misses hlogs to replicate for the region server we kill. Fixing that one also uncovered a second issue: RepSource ignores log files too quickly when the master is a bit too slow to split logs.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.