Posted to issues@hbase.apache.org by "fulin wang (JIRA)" <ji...@apache.org> on 2011/07/22 12:08:59 UTC

[jira] [Created] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
--------------------------------------------------------------------------------------------------------------------

                 Key: HBASE-4124
                 URL: https://issues.apache.org/jira/browse/HBASE-4124
             Project: HBase
          Issue Type: Bug
          Components: master
            Reporter: fulin wang


ZK restarted while a region was being assigned; the new active HM re-assigned it, but the RS warned 'already online on this server'.

Issue:
The RS fails because of 'already online on this server' and returns; the HM does not receive the message and reports 'Regions in transition timed out'.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4124:
-------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to TRUNK.  Thank you for the patch Gaojinchao.

> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, HBASE-4124_Branch90V4.patch, HBASE-4124_TrunkV1.patch, HBASE-4124_TrunkV2.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region was being assigned; the new active HM re-assigned it, but the RS warned 'already online on this server'.
> Issue:
> The RS fails because of 'already online on this server' and returns; the HM does not receive the message and reports 'Regions in transition timed out'.


        

[jira] [Commented] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091499#comment-13091499 ] 

Ted Yu commented on HBASE-4124:
-------------------------------

The code you looked at came from HBASE-4083.
I don't think it is a bug.

Please provide the patch for TRUNK.


        

[jira] [Updated] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaojinchao updated HBASE-4124:
------------------------------

    Attachment: HBASE-4124_TrunkV2.patch

I am running all the test cases. My new modification is clearer.


        

[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "fulin wang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084780#comment-13084780 ] 

fulin wang commented on HBASE-4124:
-----------------------------------

gaojinchao, please fix the issue. Thanks.


        

[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070103#comment-13070103 ] 

stack commented on HBASE-4124:
------------------------------

HBASE-3741 changes the behavior here: we now notice when we are asked to open a region that is already open, and we throw an exception back to the master.  I think the master will then reassign the region elsewhere, which is not what we want if it's a RegionAlreadyInTransitionException.  This change means we will not keep retrying, but I think there is more to do.
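
The distinction drawn above can be sketched roughly as follows. This is an illustration only, not HBase code: the class OpenFailureSketch, the exception AlreadyInTransition, and the Action enum are hypothetical stand-ins, and the real master logic may differ.

```java
/** Illustrative sketch only; none of these types are the HBase classes. */
final class OpenFailureSketch {

  /** Hypothetical stand-in for an "already in transition on this server" signal. */
  static final class AlreadyInTransition extends Exception {
    AlreadyInTransition(String message) {
      super(message);
    }
  }

  /** What the master does next after a failed open request (assumed options). */
  enum Action { WAIT_FOR_SERVER, REASSIGN_ELSEWHERE }

  /** Decide the follow-up action from the exception the region server sent back. */
  static Action onOpenFailure(Exception e) {
    if (e instanceof AlreadyInTransition) {
      // The server is already working on this region, so moving it
      // to another server would be the wrong reaction.
      return Action.WAIT_FOR_SERVER;
    }
    // Any other failure: pick a different server.
    return Action.REASSIGN_ELSEWHERE;
  }
}
```

The point of the sketch is only that the two failure kinds call for different reactions; how long the master should wait is left open, as in the comment above.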


        

[jira] [Commented] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091474#comment-13091474 ] 

gaojinchao commented on HBASE-4124:
-----------------------------------

@Ted
I am making a patch for TRUNK, but I have some questions about TRUNK.
This looks like a bug.
In the assign function, when we get the return value "ALREADY_OPENED",
should we update the meta table, or is that done on the region server?

HMaster code:
{code}
RegionOpeningState regionOpenState = serverManager.sendRegionOpen(plan
    .getDestination(), state.getRegion());
if (regionOpenState == RegionOpeningState.ALREADY_OPENED) {
{code}

Region server code (if we don't update the meta table, the client may access the old server):
{code}
HRegion onlineRegion = this.getFromOnlineRegions(region.getEncodedName());
if (null != onlineRegion) {
  LOG.warn("Attempted open of " + region.getEncodedName()
      + " but already online on this server");
  return RegionOpeningState.ALREADY_OPENED;
}
{code}
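
One way to picture the master side of this question is the sketch below. It is an assumption-laden illustration, not the patch: AssignSketch, OpenResult, and the two maps are hypothetical stand-ins for the master's bookkeeping, and it does not settle whether the meta table must also be updated.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; these are not the HBase classes. */
final class AssignSketch {

  /** Hypothetical reply to an open request. OPENED here means "request accepted". */
  enum OpenResult { OPENED, ALREADY_OPENED }

  /** Regions still in transition: encoded region name -> destination server. */
  private final Map<String, String> regionsInTransition = new HashMap<>();

  /** Regions considered online: encoded region name -> hosting server. */
  private final Map<String, String> onlineRegions = new HashMap<>();

  /** Record that an open request for the region was sent to the server. */
  void startAssign(String region, String server) {
    regionsInTransition.put(region, server);
  }

  /** React to the server's reply to the open request. */
  void handleOpenResult(String region, String server, OpenResult result) {
    if (result == OpenResult.ALREADY_OPENED) {
      // No further transition event will arrive for this region, so clear
      // it now; otherwise it would sit in transition until it times out.
      regionsInTransition.remove(region);
      onlineRegions.put(region, server);
    }
    // For OPENED, keep waiting until the server reports the open finished.
  }

  boolean isInTransition(String region) {
    return regionsInTransition.containsKey(region);
  }

  /** Hosting server of an online region, or null if the region is not online. */
  String serverOf(String region) {
    return onlineRegions.get(region);
  }
}
```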


        

[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090298#comment-13090298 ] 

ramkrishna.s.vasudevan commented on HBASE-4124:
-----------------------------------------------

@Gao
Correct me if I am wrong.  I can understand the intention behind the logic.
{code}
+          RegionTransitionData data = ZKAssign.getData(watcher, regionInfo.getEncodedName()); 
+          
+          //When zk node has been updated by a living server, we consider that this region server is handling it. 
+          //So we should skip it and process it in processRegionsInTransition.
+          if (data != null && data.getServerName() != null &&
+            serverManager.isServerOnline(data.getServerName())){
+              LOG.info("The region " + regionInfo.getEncodedName() +
+                "is processing by " + data.getServerName());
+            continue;
+          }
{code}
But suppose that, as part of rebuildUserRegions(), the master finds a server to be dead and adds that RS to the dead servers; you also said the master was killed.
How can we have a dead RS if we don't kill the RS? And if the master is also killed, how can the regions be assigned to some other RS (how can the state of that region's node change in ZK)?
Maybe I am not understanding something.  If you can explain this, it will help me with the TimeoutMonitor.
The rest looks fine.
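
The condition in the quoted patch can be restated in isolation as below. This is a minimal sketch under stated assumptions: RebuildSketch and TransitionData are hypothetical stand-ins for the ZK node data and the server manager's online-server lookup, not the HBase types.

```java
import java.util.Set;

/** Illustrative sketch only; these are not the HBase classes. */
final class RebuildSketch {

  /** Hypothetical stand-in for the data stored in a region's ZK node. */
  static final class TransitionData {
    final String serverName; // may be null if no server has touched the node

    TransitionData(String serverName) {
      this.serverName = serverName;
    }
  }

  /**
   * True when the node was last updated by a server that is still online,
   * in which case the rebuild loop would skip the region and leave it to
   * the regions-in-transition processing.
   */
  static boolean handledByLiveServer(TransitionData data, Set<String> onlineServers) {
    return data != null
        && data.serverName != null
        && onlineServers.contains(data.serverName);
  }
}
```

A region with no ZK node, a node without a server name, or a node written by a server that is no longer online all fall through to the normal handling in this sketch.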


        

[jira] [Commented] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093173#comment-13093173 ] 

Hudson commented on HBASE-4124:
-------------------------------

Integrated in HBase-TRUNK #2158 (See [https://builds.apache.org/job/HBase-TRUNK/2158/])
    HBASE-4124 ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'

stack : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java



        

[jira] [Commented] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091387#comment-13091387 ] 

Ted Yu commented on HBASE-4124:
-------------------------------

Patch integrated to 0.90 branch after wrapping long lines.

Thanks for the patch Jinchao.


        

[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "fulin wang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070342#comment-13070342 ] 

fulin wang commented on HBASE-4124:
-----------------------------------

I can't find where getRegionsInTransitionInRS().add() is called, so I do not understand why this function was added.
As for the 'already online on this server' error, I think the region should be closed or reassigned. I am trying to make a patch.


        

[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091329#comment-13091329 ] 

Ted Yu commented on HBASE-4124:
-------------------------------

@Jinchao:
Can you prepare a patch for TRUNK as well?


        

[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaojinchao updated HBASE-4124:
------------------------------

    Attachment: HBASE-4124_Branch90V2.patch


        

[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090330#comment-13090330 ] 

Ted Yu commented on HBASE-4124:
-------------------------------

All tests passed with patch v3 for 0.90 branch.


        

[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090931#comment-13090931 ] 

Ted Yu commented on HBASE-4124:
-------------------------------

+1 on patch v4.
Minor comment:
A few lines such as the following are longer than 80 characters:
{code}
+            (null == data.getServerName() || !serverManager.isServerOnline(data.getServerName()))) {
{code}


        

[jira] [Commented] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092586#comment-13092586 ] 

Ted Yu commented on HBASE-4124:
-------------------------------

That's right.
Waiting for decision on refactoring before committing.


        

[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088147#comment-13088147 ] 

gaojinchao commented on HBASE-4124:
-----------------------------------

Sorry, step 3: start up the master again.


        

[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaojinchao updated HBASE-4124:
------------------------------

    Attachment: HBASE-4124_Branch90V2.patch


        

[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaojinchao updated HBASE-4124:
------------------------------

    Fix Version/s: 0.90.5

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088173#comment-13088173 ] 

gaojinchao commented on HBASE-4124:
-----------------------------------

I have added a test case for opening a region.

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090141#comment-13090141 ] 

gaojinchao commented on HBASE-4124:
-----------------------------------

@Ted
Does this need a patch for trunk as well?
Trunk has changed a lot, so I need some time to study it.

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089261#comment-13089261 ] 

ramkrishna.s.vasudevan commented on HBASE-4124:
-----------------------------------------------

@Gao
{bq}
step 3: start up the master again.

In the scenario you described, had the RS already opened the region when the master restarted? I think the scenario here is that the RS is also dead.
If so, the assignment manager will try assigning the region to a new RS. Do you see any problem with that?
If the RS is alive, the znode will be in the OPENED state and processRIT will take care of clearing the node, since the region is already opened. Could you be more clear about the state of the RS after you killed the master, and about the state of the znode in ZooKeeper for that region?


> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090698#comment-13090698 ] 

gaojinchao commented on HBASE-4124:
-----------------------------------

@ram
How come we have a dead RS if we don't kill the RS?

gao: If you stop the cluster, the meta will handle the server information.

If the master is also killed, how can the regions be assigned to some other RS?

gao: When the master starts up, it collects the regions that belong to the same region server and
     calls sendRegionOpen(destination, regions).
     If the number of regions is relatively large, the region server needs a long time to open them.
     If the master crashes in the meantime, the new master may reopen the regions on another region server.
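
To illustrate the batching described above, here is a minimal sketch of grouping regions by destination server before sending one open request per server. Only sendRegionOpen(destination, regions) comes from the comment above; the class name, the method groupByServer and the use of plain strings for regions and servers are assumptions made for illustration, not the actual master code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: batch regions per destination server. */
class BulkAssignSketch {

  /**
   * Groups region names by the server they should be opened on.
   * In the real master, each resulting list would be passed to
   * something like sendRegionOpen(destination, regions).
   */
  static Map<String, List<String>> groupByServer(Map<String, String> regionToServer) {
    Map<String, List<String>> plan = new TreeMap<>();
    for (Map.Entry<String, String> e : regionToServer.entrySet()) {
      // One list per destination; regions are appended in iteration order.
      plan.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
    }
    return plan;
  }

  public static void main(String[] args) {
    Map<String, String> regionToServer = new TreeMap<>();
    regionToServer.put("regionA", "rs1");
    regionToServer.put("regionB", "rs2");
    regionToServer.put("regionC", "rs1");
    // A large list for one server is what makes the open take a long time.
    System.out.println(groupByServer(regionToServer));
  }
}
```

The larger a per-server batch is, the longer the window in which a master failover can find those regions still in transition.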
     

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaojinchao updated HBASE-4124:
------------------------------

    Attachment: HBASE-4124_Branch90V4.patch

Modified the comments according to the review.
Thanks to Ted for the careful review.

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, HBASE-4124_Branch90V4.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Updated] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4124:
--------------------------

    Summary: ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.  (was: ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.)

> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, HBASE-4124_Branch90V4.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaojinchao updated HBASE-4124:
------------------------------

    Attachment: HBASE-4124_Branch90V1_trial.patch

I have made a trial patch that tries to fix this issue.
So far I have only run the unit tests. Please review it first and give me some suggestions. I will test it tomorrow. Thanks.

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>         Attachments: HBASE-4124_Branch90V1_trial.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Updated] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4124:
--------------------------

    Attachment: 4124-trunk.v2

This patch adds a null check for sn.

According to http://download.oracle.com/javase/1,5.0/docs/api/java/util/concurrent/ConcurrentHashMap.html, null cannot be used as a key.
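
As a small illustration of why the null check matters, the sketch below shows that ConcurrentHashMap.put throws a NullPointerException for a null key, and how a guard avoids it. The class NullKeyDemo, its methods, and the use of a String in place of sn are assumptions for illustration only; this is not the code in the patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical demo: ConcurrentHashMap does not accept null keys. */
class NullKeyDemo {

  /** Returns true if putting a null key raises NullPointerException. */
  static boolean rejectsNullKey(Map<String, String> map) {
    try {
      map.put(null, "region");
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  /** Guarded put: skips the write when sn (a stand-in for the server name) is null. */
  static boolean putIfKeyPresent(Map<String, String> map, String sn, String region) {
    if (sn == null) {
      return false; // nothing is written, so no exception can be thrown
    }
    map.put(sn, region);
    return true;
  }

  public static void main(String[] args) {
    Map<String, String> regionsByServer = new ConcurrentHashMap<>();
    System.out.println(rejectsNullKey(regionsByServer));
    System.out.println(putIfKeyPresent(regionsByServer, null, "r1"));
    System.out.println(putIfKeyPresent(regionsByServer, "rs1", "r1"));
  }
}
```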

> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: 4124-trunk.v2, HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, HBASE-4124_Branch90V4.patch, HBASE-4124_TrunkV1.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090752#comment-13090752 ] 

Ted Yu commented on HBASE-4124:
-------------------------------

+1 on patch version 3.

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Commented] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092187#comment-13092187 ] 

stack commented on HBASE-4124:
------------------------------

The patch is basically good, but a bunch of code is repeated.  Can we refactor the repeated code out into a method and call that?

For example:

{code}
+        if (isOnDeadServer(regionInfo, deadServers)
+            && (null == data.getOrigin() || !serverManager.isServerOnline(data
+                .getOrigin()))) {
{code}

This is repeated three times.  One of the repeats does not check for null and it seems like it should.

Also, I'd write the above as follows to make it more readable (just saying...)

{code}
  if (isOnDeadServer(regionInfo, deadServers) &&
    (data.getOrigin() == null ||
      !serverManager.isServerOnline(data.getOrigin()))) {
{code}

Otherwise patch makes sense.  Nice one Gao
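
One possible shape for the refactoring suggested above is sketched here. It assumes simplified stand-ins: the nested ServerManager interface, the string-based server names and the helper name shouldReassign are all hypothetical and do not mirror the real AssignmentManager types. The point is only that the null check lives in one place, so no call site can omit it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of pulling the repeated condition into one helper. */
class DeadServerCheck {

  /** Stand-in for the master's view of which servers are online. */
  interface ServerManager {
    boolean isServerOnline(String serverName);
  }

  /**
   * True when the region sits on a dead server and the server recorded
   * in the znode (origin) is unknown or no longer online.
   */
  static boolean shouldReassign(String regionServer, String origin,
      Set<String> deadServers, ServerManager serverManager) {
    boolean onDeadServer = deadServers.contains(regionServer);
    // The null check is part of the helper, so every caller gets it.
    boolean originGone = origin == null || !serverManager.isServerOnline(origin);
    return onDeadServer && originGone;
  }

  public static void main(String[] args) {
    Set<String> dead = new HashSet<>(Arrays.asList("rs1"));
    ServerManager sm = name -> "rs2".equals(name);
    System.out.println(shouldReassign("rs1", null, dead, sm));
    System.out.println(shouldReassign("rs1", "rs2", dead, sm));
  }
}
```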

> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, HBASE-4124_Branch90V4.patch, HBASE-4124_TrunkV1.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaojinchao updated HBASE-4124:
------------------------------

    Attachment:     (was: HBASE-4124_Branch90V2.patch)

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaojinchao updated HBASE-4124:
------------------------------

    Attachment:     (was: HBASE-4124_Branch90V2.patch)

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Commented] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092583#comment-13092583 ] 

gaojinchao commented on HBASE-4124:
-----------------------------------

@Ted thanks for your work.
sn is already checked for null in the statement above:

        if (sn == null) {
          LOG.warn("Region in transition " + regionInfo.getEncodedName() +
            " references a null server; letting RIT timeout so will be " +
            "assigned elsewhere");
          break;
        }

> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: 4124-trunk.v2, HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, HBASE-4124_Branch90V4.patch, HBASE-4124_TrunkV1.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090677#comment-13090677 ] 

gaojinchao commented on HBASE-4124:
-----------------------------------

@Ted 
I have run all the tests. Thanks for your work.

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaojinchao updated HBASE-4124:
------------------------------

    Status: Patch Available  (was: Open)

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Commented] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092618#comment-13092618 ] 

gaojinchao commented on HBASE-4124:
-----------------------------------

All test cases passed. Thanks.


> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, HBASE-4124_Branch90V4.patch, HBASE-4124_TrunkV1.patch, HBASE-4124_TrunkV2.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089890#comment-13089890 ] 

gaojinchao commented on HBASE-4124:
-----------------------------------

The RS isn't dead. I can reproduce and verify it.

The ZK node status has already changed before the region is added to the RIT set; see the function processDeadServers.
That is why a region is assigned twice.

        // If region was in transition (was in zk) force it offline for reassign
        try {
          //Process with existing RS shutdown code  
          boolean assign =
            ServerShutdownHandler.processDeadRegion(regionInfo, result, this,
              this.catalogTracker);
          if (assign) {
            ZKAssign.createOrForceNodeOffline(watcher, regionInfo,
              master.getServerName()); 
          }



> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM could not receive the message and reported 'Regions in transition timed out'.


[jira] [Commented] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092508#comment-13092508 ] 

Ted Yu commented on HBASE-4124:
-------------------------------

I ran all the tests against 4124-trunk.v2; there was only one test failure:
{code}
Failed tests:   testWritesWhileGetting(org.apache.hadoop.hbase.regionserver.TestHRegion): expected:<\x00\x00\x00\xE0> but was:<\x00\x00\x00\xDE>
{code}
I think the above failure is due to HBASE-3855.

> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: 4124-trunk.v2, HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, HBASE-4124_Branch90V4.patch, HBASE-4124_TrunkV1.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM cannot receive the message and reports 'Regions in transition timed out'.


        

[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaojinchao updated HBASE-4124:
------------------------------

    Attachment: HBASE-4124_Branch90V2.patch

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM cannot receive the message and reports 'Regions in transition timed out'.


        

[jira] [Assigned] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaojinchao reassigned HBASE-4124:
---------------------------------

    Assignee: gaojinchao

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM cannot receive the message and reports 'Regions in transition timed out'.


        

[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088146#comment-13088146 ] 

gaojinchao commented on HBASE-4124:
-----------------------------------

I have finished the test. Here is the scenario:
step 1: Start up the cluster.
step 2: Abort the master right after it finishes calling "sendRegionOpen(destination, regions)".
step 3: Start up the cluster again.

The steps above reproduce the issue.
When the master fails over, META still records the dead server, but the region is being processed by a live region server.
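
The scenario above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the patch attached to this issue: `ToyRegionServer`, `ToyMaster` and `OpenReply` are hypothetical names that do not exist in HBase, and no real HBase API is called. (A later comment in this thread corrects step 3 to starting the master again.) It shows why a region server that merely warns and returns on 'already online on this server' leaves the region stuck in transition on the new master, and how returning an explicit reply would let the master clear it.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the race described above; hypothetical names, not HBase code. */
final class AlreadyOnlineSketch {

  /** Hypothetical reply a region server could give to an open request. */
  enum OpenReply { OPENING, ALREADY_ONLINE }

  /** Hypothetical stand-in for a region server. */
  static final class ToyRegionServer {
    private final Set<String> onlineRegions = new HashSet<>();

    /** Handles an open request and tells the caller what happened. */
    OpenReply openRegion(String region) {
      if (onlineRegions.contains(region)) {
        // The reported behaviour was to warn and return here, so the master heard nothing.
        return OpenReply.ALREADY_ONLINE;
      }
      onlineRegions.add(region);
      return OpenReply.OPENING;
    }
  }

  /** Hypothetical stand-in for the new active master after failover. */
  static final class ToyMaster {
    private final Set<String> regionsInTransition = new HashSet<>();

    /** Re-assigns a region; clears it from transition if the server already has it. */
    void assign(String region, ToyRegionServer rs) {
      regionsInTransition.add(region);
      if (rs.openRegion(region) == OpenReply.ALREADY_ONLINE) {
        regionsInTransition.remove(region);
      }
    }

    boolean isInTransition(String region) {
      return regionsInTransition.contains(region);
    }
  }

  public static void main(String[] args) {
    ToyRegionServer rs = new ToyRegionServer();
    rs.openRegion("region-a");        // the old master sent the open, then aborted
    ToyMaster newMaster = new ToyMaster();
    newMaster.assign("region-a", rs); // the new master re-assigns the same region
    System.out.println(newMaster.isInTransition("region-a")); // prints "false"
  }
}
```

Without the `ALREADY_ONLINE` branch in `assign`, the region would remain in `regionsInTransition` indefinitely, which corresponds to the 'Regions in transition timed out' report in the issue description.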


> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM cannot receive the message and reports 'Regions in transition timed out'.


        

[jira] [Commented] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092489#comment-13092489 ] 

Ted Yu commented on HBASE-4124:
-------------------------------

bq. Can we refactor the repeating code out to method and call that?
This would produce a difference between the 0.90 branch and TRUNK, right?
Should we defer this refactoring to another JIRA?

I agree with the comment about the null pointer check.

> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, HBASE-4124_Branch90V4.patch, HBASE-4124_TrunkV1.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM cannot receive the message and reports 'Regions in transition timed out'.


        

[jira] [Updated] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaojinchao updated HBASE-4124:
------------------------------

    Attachment: HBASE-4124_TrunkV1.patch

I have made a patch. I found that two test cases (TestAdmin and RollLoging) do not pass. I ran them on the raw (unpatched) trunk as well.

> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, HBASE-4124_Branch90V4.patch, HBASE-4124_TrunkV1.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM cannot receive the message and reports 'Regions in transition timed out'.


        

[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "fulin wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

fulin wang updated HBASE-4124:
------------------------------

    Attachment: log.txt

The error log.

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>         Attachments: log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM cannot receive the message and reports 'Regions in transition timed out'.


        

[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090250#comment-13090250 ] 

Ted Yu commented on HBASE-4124:
-------------------------------

Once patch v3 receives a +1 vote, a patch for TRUNK should be made.
Thanks for the effort.

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM cannot receive the message and reports 'Regions in transition timed out'.


        

[jira] [Updated] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gaojinchao updated HBASE-4124:
------------------------------

    Attachment: HBASE-4124_Branch90V3.patch

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM cannot receive the message and reports 'Regions in transition timed out'.


        

[jira] [Updated] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4124:
--------------------------

    Attachment:     (was: 4124-trunk.v2)

> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, HBASE-4124_Branch90V4.patch, HBASE-4124_TrunkV1.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM cannot receive the message and reports 'Regions in transition timed out'.


        

[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089987#comment-13089987 ] 

Ted Yu commented on HBASE-4124:
-------------------------------

HBASE-4124_Branch90V2.patch makes sense.
Please correct grammar in javadocs.

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM cannot receive the message and reports 'Regions in transition timed out'.


        

[jira] [Commented] (HBASE-4124) ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090759#comment-13090759 ] 

ramkrishna.s.vasudevan commented on HBASE-4124:
-----------------------------------------------

bq. sorry. step 3: startup master again.
This statement confused me a bit.
Thanks for your explanation. :)

> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM cannot receive the message and reports 'Regions in transition timed out'.


        

[jira] [Commented] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092183#comment-13092183 ] 

gaojinchao commented on HBASE-4124:
-----------------------------------

Thanks, Ted.
The test case has passed.

-------------------------------------------------------
 T E S T S
-------------------------------------------------------

Running org.apache.hadoop.hbase.client.TestAdmin


Tests run: 28, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 358.001 sec

Results :

Tests run: 28, Failures: 0, Errors: 0, Skipped: 0


> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, HBASE-4124_Branch90V4.patch, HBASE-4124_TrunkV1.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM cannot receive the message and reports 'Regions in transition timed out'.


        

[jira] [Commented] (HBASE-4124) ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.

Posted by "gaojinchao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092191#comment-13092191 ] 

gaojinchao commented on HBASE-4124:
-----------------------------------

@Stack, thanks for your review.
I am about to catch a plane; I will revise the patch according to your comments once I am back in Shenzhen.

> ZK restarted while a region is being assigned, new active HM re-assigns it but the RS warns 'already online on this server'.
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, HBASE-4124_Branch90V4.patch, HBASE-4124_TrunkV1.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS warned 'already online on this server'.
> Issue:
> The RS failed because of 'already online on this server' and returned; the HM cannot receive the message and reports 'Regions in transition timed out'.
