Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2018/04/09 04:54:00 UTC

[jira] [Created] (HBASE-20366) Procedure State != ProcedureState.RUNNABLE; IllegalArgumentException

stack created HBASE-20366:
-----------------------------

             Summary: Procedure State != ProcedureState.RUNNABLE; IllegalArgumentException
                 Key: HBASE-20366
                 URL: https://issues.apache.org/jira/browse/HBASE-20366
             Project: HBase
          Issue Type: Bug
          Components: amv2
            Reporter: stack


A PE worker dies and a region is offlined because the procedure is not runnable when the worker goes to run it. It looks like this:

{code}
2018-04-07 19:58:50,589 INFO  [PEWorker-5] procedure.MasterProcedureScheduler: pid=8304, state=WAITING:MOVE_REGION_ASSIGN; MoveRegionProcedure hri=IntegrationTestBigLinkedList,p\xC3\x11\xB2,1523155040553.187ee18fb3dd1a7ac1f9f2b667160729., source=ve0534.halxg.cloudera.com,16020,1523153184521, destination=ve0542.halxg.cloudera.com,16020,1523155964184 checking lock on 187ee18fb3dd1a7ac1f9f2b667160729
2018-04-07 19:58:50,589 INFO  [PEWorker-14] procedure.MasterProcedureScheduler: pid=8302, state=RUNNABLE:MOVE_REGION_ASSIGN; MoveRegionProcedure hri=IntegrationTestBigLinkedList,\xEC0\x83\x96*\x86Qsh\xD82\x1E\xAB\x06$\x89,1523151456082.84e97ce42aeb78a2abaf8f17a278b735., source=ve0534.halxg.cloudera.com,16020,1523153184521, destination=ve0542.halxg.cloudera.com,16020,1523155964184 checking lock on 84e97ce42aeb78a2abaf8f17a278b735
2018-04-07 19:58:50,591 WARN  [PEWorker-5] procedure2.ProcedureExecutor: Worker terminating UNNATURALLY null
java.lang.IllegalArgumentException: pid=8304, state=WAITING:MOVE_REGION_ASSIGN; MoveRegionProcedure hri=IntegrationTestBigLinkedList,p\xC3\x11\xB2,1523155040553.187ee18fb3dd1a7ac1f9f2b667160729., source=ve0534.halxg.cloudera.com,16020,1523153184521, destination=ve0542.halxg.cloudera.com,16020,1523155964184
  at org.apache.hbase.thirdparty.com.google.common.base.Preconditions.checkArgument(Preconditions.java:134)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1430)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1221)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
  at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
{code}

This killed my job because it offlined a region.
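
For readers who have not looked at this code path, here is a minimal, self-contained sketch of the failure mode, going only by what the stack trace shows: a checkArgument-style guard in execProcedure throws IllegalArgumentException when handed a procedure that is not RUNNABLE, and the exception takes the worker with it. The class RunnableCheckSketch, the State enum, Proc, and runWorkerOnce are all hypothetical stand-ins made up for illustration; this is not the actual ProcedureExecutor code.

```java
public class RunnableCheckSketch {
    // Hypothetical, simplified stand-in for the two states seen in the log.
    enum State { RUNNABLE, WAITING }

    // Hypothetical stand-in for a procedure: just a pid and a state.
    static final class Proc {
        final long pid;
        final State state;

        Proc(long pid, State state) {
            this.pid = pid;
            this.state = state;
        }

        @Override
        public String toString() {
            return "pid=" + pid + ", state=" + state;
        }
    }

    // Same shape as a checkArgument guard: refuse anything that is not
    // RUNNABLE by throwing IllegalArgumentException.
    static void execProcedure(Proc proc) {
        if (proc.state != State.RUNNABLE) {
            throw new IllegalArgumentException(proc.toString());
        }
        // The procedure body would run here.
    }

    // Returns true if the worker survived the call, false if the guard
    // threw and the worker would have terminated.
    static boolean runWorkerOnce(Proc proc) {
        try {
            execProcedure(proc);
            return true;
        } catch (IllegalArgumentException e) {
            System.out.println("Worker terminating: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // pid=8302 was RUNNABLE in the log above; pid=8304 was WAITING.
        System.out.println(runWorkerOnce(new Proc(8302, State.RUNNABLE)));
        System.out.println(runWorkerOnce(new Proc(8304, State.WAITING)));
    }
}
```

In the sketch, the WAITING procedure is the one that trips the guard, which matches pid=8304 above. Whether the guard should tolerate such a procedure or the scheduler should never hand it out is the open question here.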

Narrative:

 * Balancer moves this region....
 * Move procedure does dispatch to unassign...
 * Suspiciously, the close comes in unannounced... it's as though it is a close from another procedure...

 2018-04-07 19:58:24,296 INFO  [PEWorker-9] assignment.RegionStateStore: pid=8305 updating hbase:meta row=IntegrationTestBigLinkedList,p\xC3\x11\xB2,1523155040553.187ee18fb3dd1a7ac1f9f2b667160729., regionState=CLOSED
 * Master is killed by monkey.
 * Recovery. Region is in CLOSED state.
 * We go to schedule the move region procedure again... Its state must not have been updated on master crash.

 2018-04-07 19:58:50,589 INFO  [PEWorker-5] procedure.MasterProcedureScheduler: pid=8304, state=WAITING:MOVE_REGION_ASSIGN; MoveRegionProcedure hri=IntegrationTestBigLinkedList,p\xC3\x11\xB2,1523155040553.187ee18fb3dd1a7ac1f9f2b667160729., source=ve0534.halxg.cloudera.com,16020,1523153184521, destination=ve0542.halxg.cloudera.com,16020,1523155964184 checking lock on 187ee18fb3dd1a7ac1f9f2b667160729

 * And then we get

 2018-04-07 19:58:50,591 WARN  [PEWorker-5] procedure2.ProcedureExecutor: Worker terminating UNNATURALLY null
 java.lang.IllegalArgumentException: pid=8304, state=WAITING:MOVE_REGION_ASSIGN; MoveRegionProcedure hri=IntegrationTestBigLinkedList,p\xC3\x11\xB2,1523155040553.187ee18fb3dd1a7ac1f9f2b667160729., source=ve0534.halxg.cloudera.com,16020,1523153184521, destination=ve0542.halxg.cloudera.com,16020,1523155964184
   at org.apache.hbase.thirdparty.com.google.common.base.Preconditions.checkArgument(Preconditions.java:134)
   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1430)
   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1221)
   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
 





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)