Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2018/04/09 04:54:00 UTC
[jira] [Created] (HBASE-20366) Procedure State != ProcedureState.RUNNABLE; IllegalArgumentException
stack created HBASE-20366:
-----------------------------
Summary: Procedure State != ProcedureState.RUNNABLE; IllegalArgumentException
Key: HBASE-20366
URL: https://issues.apache.org/jira/browse/HBASE-20366
Project: HBase
Issue Type: Bug
Components: amv2
Reporter: stack
A PEWorker dies and a region is offlined because the procedure is not RUNNABLE when the procedure executor goes to run it. It looks like this:
{code}
2018-04-07 19:58:50,589 INFO [PEWorker-5] procedure.MasterProcedureScheduler: pid=8304, state=WAITING:MOVE_REGION_ASSIGN; MoveRegionProcedure hri=IntegrationTestBigLinkedList,p\xC3\x11\xB2,1523155040553.187ee18fb3dd1a7ac1f9f2b667160729., source=ve0534.halxg.cloudera.com,16020,1523153184521, destination=ve0542.halxg.cloudera.com,16020,1523155964184 checking lock on 187ee18fb3dd1a7ac1f9f2b667160729
2018-04-07 19:58:50,589 INFO [PEWorker-14] procedure.MasterProcedureScheduler: pid=8302, state=RUNNABLE:MOVE_REGION_ASSIGN; MoveRegionProcedure hri=IntegrationTestBigLinkedList,\xEC0\x83\x96*\x86Qsh\xD82\x1E\xAB\x06$\x89,1523151456082.84e97ce42aeb78a2abaf8f17a278b735., source=ve0534.halxg.cloudera.com,16020,1523153184521, destination=ve0542.halxg.cloudera.com,16020,1523155964184 checking lock on 84e97ce42aeb78a2abaf8f17a278b735
2018-04-07 19:58:50,591 WARN [PEWorker-5] procedure2.ProcedureExecutor: Worker terminating UNNATURALLY null
java.lang.IllegalArgumentException: pid=8304, state=WAITING:MOVE_REGION_ASSIGN; MoveRegionProcedure hri=IntegrationTestBigLinkedList,p\xC3\x11\xB2,1523155040553.187ee18fb3dd1a7ac1f9f2b667160729., source=ve0534.halxg.cloudera.com,16020,1523153184521, destination=ve0542.halxg.cloudera.com,16020,1523155964184
at org.apache.hbase.thirdparty.com.google.common.base.Preconditions.checkArgument(Preconditions.java:134)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1430)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1221)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
{code}
This killed my job because it offlined a region.
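For readers unfamiliar with the failure mode, here is a minimal sketch of what the stack trace above suggests: a checkArgument-style precondition on the procedure state throws IllegalArgumentException, and the exception takes the worker thread down with it. This is not the HBase code. The class ProcedureStateCheckSketch, the State enum, the Proc class, and the runOnWorker helper are all hypothetical names made up for illustration, and the uncaught-exception handler is a simplification of whatever the real worker does when it logs "Worker terminating UNNATURALLY".

```java
import java.util.concurrent.atomic.AtomicReference;

public class ProcedureStateCheckSketch {
    // Hypothetical stand-in for the two procedure states seen in the log above.
    enum State { RUNNABLE, WAITING }

    // Hypothetical minimal procedure: only a pid and a state.
    static final class Proc {
        final long pid;
        final State state;

        Proc(long pid, State state) {
            this.pid = pid;
            this.state = state;
        }

        @Override
        public String toString() {
            return "pid=" + pid + ", state=" + state;
        }
    }

    // Mimics a checkArgument-style precondition: if the procedure is not RUNNABLE,
    // throw IllegalArgumentException with the procedure's string form as the message.
    static void execProcedure(Proc proc) {
        if (proc.state != State.RUNNABLE) {
            throw new IllegalArgumentException(proc.toString());
        }
        // A real executor would run the next step of the procedure here.
    }

    // Runs one procedure on its own worker thread and returns the exception
    // that killed the thread, or null if the thread finished normally.
    static Throwable runOnWorker(Proc proc) {
        AtomicReference<Throwable> death = new AtomicReference<>();
        Thread worker = new Thread(() -> execProcedure(proc), "PEWorker-sketch");
        // The handler runs on the dying thread before it terminates,
        // so the value is visible once join() returns.
        worker.setUncaughtExceptionHandler((t, e) -> death.set(e));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return death.get();
    }

    public static void main(String[] args) {
        // A RUNNABLE procedure passes the check; a WAITING one kills the worker.
        System.out.println(runOnWorker(new Proc(8302, State.RUNNABLE)));
        System.out.println(runOnWorker(new Proc(8304, State.WAITING)));
    }
}
```

The point of the sketch is only that a state check failing inside the worker is fatal to that worker: nothing between the check and the thread's run loop recovers from it, so whatever the procedure was in the middle of (here, a region move) is left unfinished.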
Narrative:
* Balancer moves this region....
* Move procedure does dispatch to unassign...
* Suspiciously, the close comes in unannounced... it's as though it is a close from another procedure...
2018-04-07 19:58:24,296 INFO [PEWorker-9] assignment.RegionStateStore: pid=8305 updating hbase:meta row=IntegrationTestBigLinkedList,p\xC3\x11\xB2,1523155040553.187ee18fb3dd1a7ac1f9f2b667160729., regionState=CLOSED
* Master is killed by monkey.
* Recovery. Region is in CLOSED state.
* We go to schedule the move region procedure again... Its state must not have been updated on master crash.
2018-04-07 19:58:50,589 INFO [PEWorker-5] procedure.MasterProcedureScheduler: pid=8304, state=WAITING:MOVE_REGION_ASSIGN; MoveRegionProcedure hri=IntegrationTestBigLinkedList,p\xC3\x11\xB2,1523155040553.187ee18fb3dd1a7ac1f9f2b667160729., source=ve0534.halxg.cloudera.com,16020,1523153184521, destination=ve0542.halxg.cloudera.com,16020,1523155964184 checking lock on 187ee18fb3dd1a7ac1f9f2b667160729
* And then we get
2018-04-07 19:58:50,591 WARN [PEWorker-5] procedure2.ProcedureExecutor: Worker terminating UNNATURALLY null
java.lang.IllegalArgumentException: pid=8304, state=WAITING:MOVE_REGION_ASSIGN; MoveRegionProcedure hri=IntegrationTestBigLinkedList,p\xC3\x11\xB2,1523155040553.187ee18fb3dd1a7ac1f9f2b667160729., source=ve0534.halxg.cloudera.com,16020,1523153184521, destination=ve0542.halxg.cloudera.com,16020,1523155964184
at org.apache.hbase.thirdparty.com.google.common.base.Preconditions.checkArgument(Preconditions.java:134)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1430)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1221)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)