Posted to issues@hbase.apache.org by "Stephen Yuan Jiang (JIRA)" <ji...@apache.org> on 2015/10/02 08:34:26 UTC

[jira] [Assigned] (HBASE-14536) Balancer & SSH interfering with each other leading to unavailability

     [ https://issues.apache.org/jira/browse/HBASE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Yuan Jiang reassigned HBASE-14536:
------------------------------------------

    Assignee: Stephen Yuan Jiang

> Balancer & SSH interfering with each other leading to unavailability
> --------------------------------------------------------------------
>
>                 Key: HBASE-14536
>                 URL: https://issues.apache.org/jira/browse/HBASE-14536
>             Project: HBase
>          Issue Type: Bug
>          Components: master, Region Assignment
>    Affects Versions: 1.1.2
>            Reporter: Devaraj Das
>            Assignee: Stephen Yuan Jiang
>             Fix For: 1.1.4
>
>         Attachments: master-log.tgz
>
>
> Came across this in our cluster:
> 1. The meta region was assigned to server 10.0.0.149,16020,1443507203340:
> {noformat}
> 2015-09-29 06:16:22,472 DEBUG [AM.ZK.Worker-pool2-t56] 
> master.RegionStates: Onlined 1588230740 on 
> 10.0.0.149,16020,1443507203340 {ENCODED => 1588230740, NAME => 
> 'hbase:meta,,1', STARTKEY => '', ENDKEY => ''}
> {noformat}
> 2. The server dies at some point:
> {noformat}
> 2015-09-29 06:18:25,952 INFO  [main-EventThread] 
> zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, 
> processing expiration [10.0.0.149,16020,1443507203340]
> 2015-09-29 06:18:25,955 DEBUG [main-EventThread] master.AssignmentManager: based on AM, current 
> region=hbase:meta,,1.1588230740 is on server=10.0.0.149,16020,1443507203340 server being checked: 
> 10.0.0.149,16020,1443507203340
> {noformat}
> 3. The balancer had computed a plan that contained a move for the meta:
> {noformat}
> 2015-09-29 06:18:26,833 INFO  [B.defaultRpcServer.handler=12,queue=0,port=16000] master.HMaster: 
> balance hri=hbase:meta,,1.1588230740, 
> src=10.0.0.149,16020,1443507203340, dest=10.0.0.205,16020,1443507257905
> {noformat}
> 4. The balancer then offlines meta but skips reassigning it, and the server shutdown handler later skips it as well, so meta remains unassigned:
> {noformat}
> 2015-09-29 06:18:26,859 DEBUG [B.defaultRpcServer.handler=12,queue=0,port=16000] 
> master.AssignmentManager: Offline hbase:meta,,1.1588230740, no need to 
> unassign since it's on a dead server: 10.0.0.149,16020,1443507203340
> ......................
> 2015-09-29 06:18:26,899 INFO  [B.defaultRpcServer.handler=12,queue=0,port=16000] master.RegionStates: 
> Offlined 1588230740 from 10.0.0.149,16020,1443507203340
> .....................
> 2015-09-29 06:18:26,914 INFO  [B.defaultRpcServer.handler=12,queue=0,port=16000] 
> master.AssignmentManager: Skip assigning hbase:meta,,1.1588230740, it is 
> on a dead but not processed yet server: 10.0.0.149,16020,1443507203340
> ....................
> 2015-09-29 06:18:26,915 DEBUG [AM.ZK.Worker-pool2-t58] master.AssignmentManager: Znode hbase:meta,,1.1588230740 deleted, 
> state: {1588230740 state=OFFLINE, ts=1443507506914, 
> server=10.0.0.149,16020,1443507203340}
> ....................
> 2015-09-29 06:18:29,447 DEBUG [MASTER_META_SERVER_OPERATIONS-10.0.0.148:16000-2] master.AssignmentManager: based on AM, current 
> region=hbase:meta,,1.1588230740 is on server=null server being checked: 
> 10.0.0.149,16020,1443507203340
> 2015-09-29 06:18:29,451 INFO  [MASTER_META_SERVER_OPERATIONS-
> 10.0.0.148:16000-2] handler.MetaServerShutdownHandler: META has been 
> assigned to otherwhere, skip assigning.
> 2015-09-29 06:18:29,452 DEBUG [MASTER_META_SERVER_OPERATIONS-10.0.0.148:16000-2] 
> master.DeadServer: Finished processing 10.0.0.149,16020,1443507203340
> {noformat}
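
The interleaving in steps 1-4 can be illustrated with a small toy model. This is only a sketch under stated assumptions, not HBase code: the class `ToyMaster`, its methods, and the short server names `rs149` and `rs205` are hypothetical stand-ins for the AssignmentManager, balancer, and server shutdown handler behavior that the log excerpts suggest. It shows how, if the balancer acts on its plan before the dead server has been processed, each side can assume the other will assign meta.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMaster:
    """Toy model of the master state in the report (hypothetical, not HBase code)."""
    region_server: dict = field(default_factory=dict)    # region -> recorded server (None = none)
    dead_unprocessed: set = field(default_factory=set)   # expired servers not yet processed
    assigned: set = field(default_factory=set)           # regions currently online
    events: list = field(default_factory=list)           # trace of decisions taken

    def expire_server(self, server):
        # Step 2: the server's ephemeral node is deleted; its regions go offline.
        self.dead_unprocessed.add(server)
        for region, host in self.region_server.items():
            if host == server:
                self.assigned.discard(region)

    def balance(self, region, src, dest):
        # Steps 3-4: the balancer executes a move plan computed earlier.
        self.region_server[region] = None                 # region is offlined from src
        if src in self.dead_unprocessed:
            # Assumption: assignment is deferred to the shutdown handler.
            self.events.append("balancer: skip assign, src dead but not processed")
            return False
        self.region_server[region] = dest
        self.assigned.add(region)
        return True

    def shutdown_handler(self, region, server, fallback):
        # Step 4 (later): the dead server is finally processed.
        if self.region_server.get(region) != server:
            # The region is no longer recorded on the dead server, so the
            # handler concludes that something else took care of it.
            self.events.append("handler: region not on dead server, skip assign")
            reassigned = False
        else:
            self.region_server[region] = fallback
            self.assigned.add(region)
            reassigned = True
        self.dead_unprocessed.discard(server)
        return reassigned


def run(balancer_first):
    """Return True if meta ends up assigned for the given ordering."""
    m = ToyMaster()
    m.region_server["meta"] = "rs149"
    m.assigned.add("meta")
    m.expire_server("rs149")
    if balancer_first:
        m.balance("meta", "rs149", "rs205")
        m.shutdown_handler("meta", "rs149", "rs205")
    else:
        m.shutdown_handler("meta", "rs149", "rs205")
        m.balance("meta", "rs149", "rs205")
    return "meta" in m.assigned


if __name__ == "__main__":
    print("balancer first, meta assigned:", run(True))
    print("handler first, meta assigned:", run(False))
```

In this toy model, only the ordering in which the balancer runs before the shutdown handler leaves meta unassigned, which matches the sequence of timestamps in the log above. How the real code paths should coordinate is left to the fix for this issue.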



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)