Posted to issues@hbase.apache.org by "Devaraj Das (JIRA)" <ji...@apache.org> on 2015/10/02 00:57:26 UTC
[jira] [Updated] (HBASE-14536) Balancer & SSH interfering with each
other leading to unavailability
[ https://issues.apache.org/jira/browse/HBASE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj Das updated HBASE-14536:
--------------------------------
Affects Version/s: 1.1.2
> Balancer & SSH interfering with each other leading to unavailability
> --------------------------------------------------------------------
>
> Key: HBASE-14536
> URL: https://issues.apache.org/jira/browse/HBASE-14536
> Project: HBase
> Issue Type: Bug
> Components: master, Region Assignment
> Affects Versions: 1.1.2
> Reporter: Devaraj Das
> Fix For: 1.1.4
>
>
> Came across this in our cluster:
> 1. The meta region was assigned to server 10.0.0.149,16020,1443507203340:
> {noformat}
> 2015-09-29 06:16:22,472 DEBUG [AM.ZK.Worker-pool2-t56]
> master.RegionStates: Onlined 1588230740 on
> 10.0.0.149,16020,1443507203340 {ENCODED => 1588230740, NAME =>
> 'hbase:meta,,1', STARTKEY => '', ENDKEY => ''}
> {noformat}
> 2. The server died at some point:
> {noformat}
> 2015-09-29 06:18:25,952 INFO [main-EventThread]
> zookeeper.RegionServerTracker: RegionServer ephemeral node deleted,
> processing expiration [10.0.0.149,16020,1443507203340]
> 2015-09-29 06:18:25,955 DEBUG [main-EventThread] master.AssignmentManager: based on AM, current
> region=hbase:meta,,1.1588230740 is on server=10.0.0.149,16020,1443507203340 server being checked:
> 10.0.0.149,16020,1443507203340
> {noformat}
> 3. The balancer had computed a plan that included a move of the meta region:
> {noformat}
> 2015-09-29 06:18:26,833 INFO [B.defaultRpcServer.handler=12,queue=0,port=16000] master.HMaster:
> balance hri=hbase:meta,,1.1588230740,
> src=10.0.0.149,16020,1443507203340, dest=10.0.0.205,16020,1443507257905
> {noformat}
> 4. The following sequence then left the meta region unassigned:
> {noformat}
> 2015-09-29 06:18:26,859 DEBUG [B.defaultRpcServer.handler=12,queue=0,port=16000]
> master.AssignmentManager: Offline hbase:meta,,1.1588230740, no need to
> unassign since it's on a dead server: 10.0.0.149,16020,1443507203340
> ......................
> 2015-09-29 06:18:26,899 INFO [B.defaultRpcServer.handler=12,queue=0,port=16000] master.RegionStates:
> Offlined 1588230740 from 10.0.0.149,16020,1443507203340
> .....................
> 2015-09-29 06:18:26,914 INFO [B.defaultRpcServer.handler=12,queue=0,port=16000]
> master.AssignmentManager: Skip assigning hbase:meta,,1.1588230740, it is
> on a dead but not processed yet server: 10.0.0.149,16020,1443507203340
> ....................
> 2015-09-29 06:18:26,915 DEBUG [AM.ZK.Worker-pool2-t58] master.AssignmentManager: Znode hbase:meta,,1.1588230740 deleted,
> state: {1588230740 state=OFFLINE, ts=1443507506914,
> server=10.0.0.149,16020,1443507203340}
> ....................
> 2015-09-29 06:18:29,447 DEBUG [MASTER_META_SERVER_OPERATIONS-10.0.0.148:16000-2] master.AssignmentManager: based on AM, current
> region=hbase:meta,,1.1588230740 is on server=null server being checked:
> 10.0.0.149,16020,1443507203340
> 2015-09-29 06:18:29,451 INFO [MASTER_META_SERVER_OPERATIONS-
> 10.0.0.148:16000-2] handler.MetaServerShutdownHandler: META has been
> assigned to otherwhere, skip assigning.
> 2015-09-29 06:18:29,452 DEBUG [MASTER_META_SERVER_OPERATIONS-10.0.0.148:16000-2]
> master.DeadServer: Finished processing 10.0.0.149,16020,1443507203340
> {noformat}
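
The interleaving in steps 1 to 4 can be sketched as a toy model. This is a minimal Python sketch, not HBase code: ToyMaster, its fields, and its methods are hypothetical names, and the branch conditions are inferred only from the log messages quoted above. It is meant to show why the ordering matters, under those assumptions.

```python
from dataclasses import dataclass, field

META = "hbase:meta,,1.1588230740"
DEAD = "10.0.0.149,16020,1443507203340"
DEST = "10.0.0.205,16020,1443507257905"


@dataclass
class ToyMaster:
    """Hypothetical stand-in for the master's region bookkeeping."""
    location: dict = field(default_factory=dict)    # region -> current host or None
    last_host: dict = field(default_factory=dict)   # region -> last known host
    dead_unprocessed: set = field(default_factory=set)
    log: list = field(default_factory=list)

    def online(self, region, server):
        self.location[region] = server
        self.last_host[region] = server

    def expire(self, server):
        # Step 2: the server is known dead, but its shutdown handling has not run yet.
        self.dead_unprocessed.add(server)

    def balance(self, region, src, dest):
        # Steps 3 and 4: execute a move whose source may have died in the meantime.
        if src in self.dead_unprocessed:
            self.log.append("offline, no need to unassign: dead server")
            self.location[region] = None
        if self.last_host.get(region) in self.dead_unprocessed:
            self.log.append("skip assigning: dead but not processed yet")
            return
        self.online(region, dest)

    def meta_shutdown_handler(self, region, server, fallback):
        # Reassign meta only if the bookkeeping still places it on the dead server.
        if self.location.get(region) == server:
            self.log.append("reassigning meta")
            self.online(region, fallback)
        else:
            self.log.append("meta assumed assigned elsewhere, skip assigning")
        self.dead_unprocessed.discard(server)


def run(balancer_first):
    """Return where meta ends up for one of the two orderings."""
    m = ToyMaster()
    m.online(META, DEAD)   # step 1
    m.expire(DEAD)         # step 2
    if balancer_first:
        m.balance(META, DEAD, DEST)
        m.meta_shutdown_handler(META, DEAD, DEST)
    else:
        m.meta_shutdown_handler(META, DEAD, DEST)
        m.balance(META, DEAD, DEST)
    return m.location[META]


if __name__ == "__main__":
    print("balancer first:", run(balancer_first=True))
    print("handler first: ", run(balancer_first=False))
```

In this model, when the balancer runs first it clears the region's location and then declines to assign it, so the shutdown handler no longer sees meta on the dead server and also declines. Neither side assigns the region, which matches the outcome reported in step 4.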
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)