Posted to issues@hbase.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2018/12/20 21:52:00 UTC

[jira] [Updated] (HBASE-21624) master startup should not wait (or die) on assigning meta replicas

     [ https://issues.apache.org/jira/browse/HBASE-21624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HBASE-21624:
-------------------------------------
    Summary: master startup should not wait (or die) on assigning meta replicas  (was: master startup should not wait on assigning meta replicas)

> master startup should not wait (or die) on assigning meta replicas
> ------------------------------------------------------------------
>
>                 Key: HBASE-21624
>                 URL: https://issues.apache.org/jira/browse/HBASE-21624
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Priority: Major
>
> Due to some other bug, a meta replica is stuck in transition forever.
> The master is running fine without it; however, the initializer thread has not finished initialization for ~19 hours and is stuck in the state below.
> It does not seem necessary to wait for the replicas to be assigned. The assignment could be fire-and-forget, and normal region handling should take care of it after that.
> {noformat}
> Thread 118 (master/...:17000:becomeActiveMaster):
>   State: TIMED_WAITING
>   Blocked count: 281
>   Waited count: 67059
>   Stack:
>     java.lang.Thread.sleep(Native Method)
>     org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:209)
>     org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:192)
>     org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitForProcedureToComplete(ProcedureSyncWait.java:151)
>     org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitForProcedureToCompleteIOE(ProcedureSyncWait.java:140)
>     org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.submitAndWaitProcedure(ProcedureSyncWait.java:133)
>     org.apache.hadoop.hbase.master.assignment.AssignmentManager.assign(AssignmentManager.java:569)
>     org.apache.hadoop.hbase.master.MasterMetaBootstrap.assignMetaReplicas(MasterMetaBootstrap.java:84)
>     org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1146)
>     org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2342)
> {noformat}
> Additionally, and semi-related: if the meta-hosting server dies during replica assignment, the master also dies immediately, which is unnecessary.
> {noformat}
> 2018-12-14 21:00:55,331 ERROR [master/...:17000:becomeActiveMaster] master.HMaster: Failed to become active master
> org.apache.hadoop.hbase.HBaseIOException: rit=OFFLINE, location=null, table=hbase:meta, region=534574363 is currently in transition
>                 at org.apache.hadoop.hbase.master.assignment.AssignmentManager.preTransitCheck(AssignmentManager.java:545)
>                 at org.apache.hadoop.hbase.master.assignment.AssignmentManager.assign(AssignmentManager.java:563)
>                 at org.apache.hadoop.hbase.master.MasterMetaBootstrap.assignMetaReplicas(MasterMetaBootstrap.java:84)
>                 at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1146)
>                 at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2342)
>                 at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:591)
> {noformat}
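
To illustrate the fire-and-forget idea from the description, here is a minimal, self-contained Java sketch. It is not HBase code and not the actual fix: the class `MetaReplicaStartupSketch`, the method `assignReplicasAsync`, and the `assignReplica` callback are hypothetical names introduced only for this example. The sketch assumes that each replica assignment can be run as an independent task, and that a failed or stuck assignment can be left to later handling rather than blocking or aborting startup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.function.IntConsumer;

/** Hypothetical sketch; none of these names are HBase APIs. */
final class MetaReplicaStartupSketch {

    private MetaReplicaStartupSketch() {}

    /**
     * Submits one assignment task per replica id and returns immediately.
     * The caller is never blocked by a stuck assignment, and an exception
     * thrown by an assignment is captured in that replica's future
     * (completed with false) instead of propagating to the caller.
     */
    static List<CompletableFuture<Boolean>> assignReplicasAsync(
            List<Integer> replicaIds, IntConsumer assignReplica, ExecutorService pool) {
        List<CompletableFuture<Boolean>> results = new ArrayList<>();
        for (int replicaId : replicaIds) {
            CompletableFuture<Boolean> result = CompletableFuture
                .runAsync(() -> assignReplica.accept(replicaId), pool)
                .handle((ignored, error) -> {
                    if (error != null) {
                        // Log and move on; assumed to be retried by normal region handling.
                        System.err.println("Replica " + replicaId
                            + " not assigned during startup: " + error);
                    }
                    return error == null;
                });
            results.add(result);
        }
        return results;
    }
}
```

A caller in the startup path would invoke `assignReplicasAsync` and continue initialization without joining on the returned futures. The futures are returned only so that tests or monitoring can observe the outcome. Whether ignoring the outcome is safe in the real master depends on the assignment manager picking up the region later, which this sketch simply assumes.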


