Posted to issues@hbase.apache.org by "Duo Zhang (JIRA)" <ji...@apache.org> on 2018/03/08 10:08:00 UTC

[jira] [Commented] (HBASE-20160) TestRestartCluster.testRetainAssignmentOnRestart uses the wrong condition to decide whether the assignment is finished

    [ https://issues.apache.org/jira/browse/HBASE-20160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16391016#comment-16391016 ] 

Duo Zhang commented on HBASE-20160:
-----------------------------------

Just use UTIL.waitTableAvailable to decide whether the assignment is finished. That method scans the given table, so if the scan succeeds we can be sure that the region is online.

Also, when shutting down the cluster, stop the master first, to make sure that the ServerCrashProcedure does not assign the region to another RS because the old server is dead.
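
For illustration only, here is a minimal self-contained sketch of the property the test ultimately wants to confirm: after the restart each region is on the same host and port, but on a new RS, i.e. with a different start code. The Server class and both helper methods are hypothetical stand-ins written for this example, not HBase APIs, and the host names and start codes are made up. It also shows why a location that was merely reloaded from meta (same start code) must not count as a finished assignment.

```java
import java.util.HashMap;
import java.util.Map;

final class RetainAssignmentCheck {

  /** Hypothetical stand-in for a region server identity: host, port and start code. */
  static final class Server {
    final String host;
    final int port;
    final long startCode;

    Server(String host, int port, long startCode) {
      this.host = host;
      this.port = port;
      this.startCode = startCode;
    }
  }

  /** True only if the region is on the same host and port but on a restarted (new) RS. */
  static boolean retainedOnRestartedServer(Server before, Server after) {
    return before.host.equals(after.host)
        && before.port == after.port
        && before.startCode != after.startCode;
  }

  /** Applies the check to every region; both maps are keyed by a region name. */
  static boolean allRetained(Map<String, Server> before, Map<String, Server> after) {
    if (!before.keySet().equals(after.keySet())) {
      return false;
    }
    for (Map.Entry<String, Server> e : before.entrySet()) {
      if (!retainedOnRestartedServer(e.getValue(), after.get(e.getKey()))) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Map<String, Server> before = new HashMap<>();
    before.put("region-a", new Server("rs1.example.org", 16020, 100L));

    // Location as reloaded from meta right after restart: still the old start code.
    Map<String, Server> stale = new HashMap<>();
    stale.put("region-a", new Server("rs1.example.org", 16020, 100L));

    // Location after the reassignment has really finished: same host and port, new start code.
    Map<String, Server> reassigned = new HashMap<>();
    reassigned.put("region-a", new Server("rs1.example.org", 16020, 200L));

    // The region count matches in both cases, so a count-based wait cannot tell them apart.
    System.out.println("stale location passes: " + allRetained(before, stale));
    System.out.println("reassigned location passes: " + allRetained(before, reassigned));
  }
}
```

In the real test the wait itself would still be done with UTIL.waitTableAvailable as described above; the sketch only models the assertion that follows the wait.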

[~stack] FYI.

> TestRestartCluster.testRetainAssignmentOnRestart uses the wrong condition to decide whether the assignment is finished
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-20160
>                 URL: https://issues.apache.org/jira/browse/HBASE-20160
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 2.0.0
>
>         Attachments: HBASE-20160.patch
>
>
> {code}
>     // Wait till master is initialized and all regions are assigned
>     RegionStates regionStates = master.getAssignmentManager().getRegionStates();
>     int expectedRegions = regionToRegionServerMap.size() + 1;
>     while (!master.isInitialized()
>         || regionStates.getRegionAssignments().size() != expectedRegions) {
>       Threads.sleep(100);
>     }
> {code}
> Actually this does not mean the assignment is finished. In AMv2, we load the region states from meta when restarting, so regionStates.getRegionAssignments() reaches the expected count quickly. But these are just the old locations. After that, we still have to execute the ServerCrashProcedure to handle the reassignment. That's why we sometimes fail with
> {noformat}
> java.lang.AssertionError: Values should be different. Actual: 1520478964169
> 	at org.apache.hadoop.hbase.master.TestRestartCluster.testRetainAssignmentOnRestart(TestRestartCluster.java:215)
> {noformat}
> Because the ServerCrashProcedure has not finished yet, we just read the old location from meta, whereas the test wants to confirm that the region is on the same host and port but on a new RS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)