
Posted to issues@hbase.apache.org by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2013/08/01 02:19:48 UTC

[jira] [Commented] (HBASE-9085) Integration Tests fails because of bug in teardown phase where the cluster state is not being restored properly.

    [ https://issues.apache.org/jira/browse/HBASE-9085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725892#comment-13725892 ] 

Enis Soztutar commented on HBASE-9085:
--------------------------------------

Thanks for working on this. This area of the code base was not tested properly (there should be some comments saying so). It seems that your patch would fix the problem. Did you test it in your setup? If so, I'll commit this one.
                
> Integration Tests fails because of bug in teardown phase where the cluster state is not being restored properly.
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-9085
>                 URL: https://issues.apache.org/jira/browse/HBASE-9085
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.95.0, 0.94.9, 0.94.10
>            Reporter: gautam
>            Assignee: gautam
>             Fix For: 0.98.0, 0.95.2, 0.94.11
>
>         Attachments: HBASE-9085.patch._0.94, HBASE-9085.patch._0.95_or_trunk
>
>
> I was running the following test on a distributed cluster:
> bin/hbase org.apache.hadoop.hbase.IntegrationTestsDriver IntegrationTestDataIngestSlowDeterministic
> IntegrationTestingUtility.restoreCluster() is called in the teardown phase of the test.
> For a distributed cluster, it ends up calling DistributedHBaseCluster.restoreClusterStatus, which
> restores the cluster to its original state.
> The restore steps performed there do not handle one specific case:
> the initial HBase master is down, and the current HBase master is different from the initial one.
> In that case you get into this flow:
>     //check whether current master has changed
>     if (!ServerName.isSameHostnameAndPort(initial.getMaster(), current.getMaster())) {
> 	.............
>     }
> In the above code path, the current backup masters are stopped, and then the current active master is stopped as well.
> At that point, in the case described above, no HBase master is available, so any subsequent
> operation on the cluster fails, and the test fails.
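The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: MasterRestoreSketch, Cluster, restore, and the host:port strings are all hypothetical names invented for illustration, none of this is HBase code, and the startInitialFirst variant is one possible remedy, not necessarily what the attached patches do. Failover is simplified to "the first remaining master becomes active".

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical toy model of master processes in a cluster; not HBase code.
final class MasterRestoreSketch {

    static final class Cluster {
        final Set<String> running = new LinkedHashSet<>(); // "host:port" of running masters
        String active;                                     // null when no master is active

        void start(String master) {
            running.add(master);
            if (active == null) {
                active = master; // the first master to come up becomes active
            }
        }

        void stop(String master) {
            running.remove(master);
            if (master.equals(active)) {
                // Simplified failover: promote any remaining master, else none.
                active = running.isEmpty() ? null : running.iterator().next();
            }
        }
    }

    // Models the teardown flow from the description: if the active master has
    // changed, stop the current backups and then the current active master.
    // With startInitialFirst set (an assumed remedy), the initial master is
    // started beforehand if it is down, so it can take over afterwards.
    static void restore(Cluster c, String initialMaster, boolean startInitialFirst) {
        if (initialMaster.equals(c.active)) {
            return; // master unchanged, nothing to undo
        }
        if (startInitialFirst && !c.running.contains(initialMaster)) {
            c.start(initialMaster); // joins as a backup while another master is active
        }
        String currentActive = c.active;
        for (String m : new ArrayList<>(c.running)) {
            if (!m.equals(currentActive) && !m.equals(initialMaster)) {
                c.stop(m); // stop the current backup masters
            }
        }
        if (currentActive != null) {
            c.stop(currentActive); // stop the current active master
        }
    }

    // The case from the description: the initial master (m1) is down and a
    // different master (m2) is active, with one backup (m3).
    static Cluster scenario() {
        Cluster c = new Cluster();
        c.start("m2:60000");
        c.start("m3:60000");
        return c;
    }

    public static void main(String[] args) {
        Cluster asDescribed = scenario();
        restore(asDescribed, "m1:60000", false);
        System.out.println("as described, active master: " + asDescribed.active);

        Cluster withRemedy = scenario();
        restore(withRemedy, "m1:60000", true);
        System.out.println("initial started first, active master: " + withRemedy.active);
    }
}
```

In this model, the first call leaves the cluster without any master, which matches the reported teardown failure, while starting the initial master before stopping the others leaves it as the only running (and therefore active) master.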
