Posted to issues@hbase.apache.org by "Mikhail Bautin (Created) (JIRA)" <ji...@apache.org> on 2012/03/12 21:28:38 UTC

[jira] [Created] (HBASE-5566) [89-fb] Region server can get stuck getMaster on master failover

[89-fb] Region server can get stuck getMaster on master failover
----------------------------------------------------------------

                 Key: HBASE-5566
                 URL: https://issues.apache.org/jira/browse/HBASE-5566
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.89-fb
            Reporter: Mikhail Bautin
            Assignee: Mikhail Bautin


Reported by Prakash. We have a retry loop in HRegionServer.getMaster where we do not read the location of the master from ZK, so a region server can get stuck there on master failover. We need to add a unit test to reliably catch this, and fix the bug.
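
The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not the actual HRegionServer.getMaster code from 0.89-fb or the committed patch; the class name GetMasterRetrySketch, the methods connectStale and connectFresh, the host names, and the bounded attempt count are all hypothetical and exist only for illustration. The master address lookup in ZK is modeled as a Supplier and the connection attempt as a Predicate. The point is that a loop which reads the master address once keeps retrying the dead master, while a loop which re-reads the address on every attempt picks up the new master after failover.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Predicate;
import java.util.function.Supplier;

public class GetMasterRetrySketch {

    /** Buggy shape: the master address is read once, before the retry loop. */
    static String connectStale(Supplier<String> masterAddressFromZk,
                               Predicate<String> tryConnect,
                               int maxAttempts) {
        String address = masterAddressFromZk.get();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (address != null && tryConnect.test(address)) {
                return address;
            }
            // The address is never refreshed, so a failover goes unnoticed.
        }
        return null; // gave up; an unbounded loop would be stuck here forever
    }

    /** Fixed shape: the master address is re-read on every attempt. */
    static String connectFresh(Supplier<String> masterAddressFromZk,
                               Predicate<String> tryConnect,
                               int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String address = masterAddressFromZk.get();
            if (address != null && tryConnect.test(address)) {
                return address;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Stand-in for the master location stored in ZK (hypothetical hosts).
        AtomicReference<String> zk = new AtomicReference<>("master-a:60000");

        // master-a is dead; a failed attempt simulates the failover
        // completing, after which ZK points at master-b.
        Predicate<String> tryConnect = addr -> {
            boolean ok = addr.equals("master-b:60000");
            if (!ok) {
                zk.set("master-b:60000");
            }
            return ok;
        };

        System.out.println(connectStale(zk::get, tryConnect, 5));  // null
        zk.set("master-a:60000");
        System.out.println(connectFresh(zk::get, tryConnect, 5));  // master-b:60000
    }
}
```

The attempt limit is there only so the sketch terminates; with an unbounded loop the stale variant would never return, which matches the "stuck" behavior reported here.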


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5566) [89-fb] Region server can get stuck in getMaster on master failover

Posted by "Mikhail Bautin (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232762#comment-13232762 ] 

Mikhail Bautin commented on HBASE-5566:
---------------------------------------

Code reviewed at: https://reviews.facebook.net/D2283
                
> [89-fb] Region server can get stuck in getMaster on master failover
> -------------------------------------------------------------------
>
>                 Key: HBASE-5566
>                 URL: https://issues.apache.org/jira/browse/HBASE-5566
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.89-fb
>            Reporter: Prakash Khemani
>            Assignee: Mikhail Bautin
>
> This is specific to the 89-fb master. We have a retry loop in HRegionServer.getMaster where we do not read the location of the master from ZK, so a region server can get stuck there on master failover. We need to add a unit test to reliably catch this, and fix the bug.


        

[jira] [Updated] (HBASE-5566) [89-fb] Region server can get stuck in getMaster on master failover

Posted by "Mikhail Bautin (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Bautin updated HBASE-5566:
----------------------------------

    Summary: [89-fb] Region server can get stuck in getMaster on master failover  (was: [89-fb] Region server can get stuck getMaster on master failover)
    
> [89-fb] Region server can get stuck in getMaster on master failover
> -------------------------------------------------------------------
>
>                 Key: HBASE-5566
>                 URL: https://issues.apache.org/jira/browse/HBASE-5566
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.89-fb
>            Reporter: Prakash Khemani
>            Assignee: Mikhail Bautin
>
> This is specific to the 89-fb master. We have a retry loop in HRegionServer.getMaster where we do not read the location of the master from ZK, so a region server can get stuck there on master failover. We need to add a unit test to reliably catch this, and fix the bug.


        

[jira] [Resolved] (HBASE-5566) [89-fb] Region server can get stuck in getMaster on master failover

Posted by "Mikhail Bautin (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Bautin resolved HBASE-5566.
-----------------------------------

    Resolution: Fixed

Patch committed internally; it will be synced to 0.89-fb very soon.
                
> [89-fb] Region server can get stuck in getMaster on master failover
> -------------------------------------------------------------------
>
>                 Key: HBASE-5566
>                 URL: https://issues.apache.org/jira/browse/HBASE-5566
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.89-fb
>            Reporter: Prakash Khemani
>            Assignee: Mikhail Bautin
>
> This is specific to the 89-fb master. We have a retry loop in HRegionServer.getMaster where we do not read the location of the master from ZK, so a region server can get stuck there on master failover. We need to add a unit test to reliably catch this, and fix the bug.


        

[jira] [Updated] (HBASE-5566) [89-fb] Region server can get stuck getMaster on master failover

Posted by "Mikhail Bautin (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Bautin updated HBASE-5566:
----------------------------------

    Reporter: Prakash Khemani  (was: Mikhail Bautin)
    
> [89-fb] Region server can get stuck getMaster on master failover
> ----------------------------------------------------------------
>
>                 Key: HBASE-5566
>                 URL: https://issues.apache.org/jira/browse/HBASE-5566
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.89-fb
>            Reporter: Prakash Khemani
>            Assignee: Mikhail Bautin
>
> Reported by Prakash. We have a retry loop in HRegionServer.getMaster where we do not read the location of the master from ZK, so a region server can get stuck there on master failover. We need to add a unit test to reliably catch this, and fix the bug.


        

[jira] [Updated] (HBASE-5566) [89-fb] Region server can get stuck getMaster on master failover

Posted by "Mikhail Bautin (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Bautin updated HBASE-5566:
----------------------------------

    Description: 
This is specific to the 89-fb master. We have a retry loop in HRegionServer.getMaster where we do not read the location of the master from ZK, so a region server can get stuck there on master failover. We need to add a unit test to reliably catch this, and fix the bug.


  was:
Reported by Prakash. We have a retry loop in HRegionServer.getMaster where we do not read the location of the master from ZK, so a region server can get stuck there on master failover. We need to add a unit test to reliably catch this, and fix the bug.


    
> [89-fb] Region server can get stuck getMaster on master failover
> ----------------------------------------------------------------
>
>                 Key: HBASE-5566
>                 URL: https://issues.apache.org/jira/browse/HBASE-5566
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.89-fb
>            Reporter: Prakash Khemani
>            Assignee: Mikhail Bautin
>
> This is specific to the 89-fb master. We have a retry loop in HRegionServer.getMaster where we do not read the location of the master from ZK, so a region server can get stuck there on master failover. We need to add a unit test to reliably catch this, and fix the bug.
