You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hbase.apache.org by "Allan Yang (JIRA)" <ji...@apache.org> on 2018/07/04 16:08:00 UTC

[jira] [Created] (HBASE-20846) Table's shared lock is not hold by sub-procedures after master restart

Allan Yang created HBASE-20846:
----------------------------------

             Summary: Table's shared lock is not hold by sub-procedures after master restart
                 Key: HBASE-20846
                 URL: https://issues.apache.org/jira/browse/HBASE-20846
             Project: HBase
          Issue Type: Bug
    Affects Versions: 2.1.0
            Reporter: Allan Yang
            Assignee: Allan Yang
             Fix For: 3.0.0, 2.1.0, 2.0.2


Found this one when investigating ModifyTableProcedure got stuck while there was a MoveRegionProcedure going on after master restart.
Though this issue can be solved by HBASE-20752. But I discovered something else.
Before a MoveRegionProcedure can execute, it will hold the table's shared lock. so,, when a UnassignProcedure was spwaned, it will not check the table's shared lock since it is sure that its parent(MoveRegionProcedure) has aquired the table's lock.
{code:java}
// If there is parent procedure, it would have already taken xlock, so no need to take
      // shared lock here. Otherwise, take shared lock.
      if (!procedure.hasParent()
          && waitTableQueueSharedLock(procedure, table) == null) {
          return true;
      }
{code}

But, it is not the case when Master was restarted. The child procedure(UnassignProcedure) will be executed first after restart. Though it has a parent(MoveRegionProcedure), but apprently the parent didn't hold the table's lock.
So, since it began to execute without hold the table's shared lock. A ModifyTableProcedure can aquire the table's exclusive lock and execute at the same time. Which is not possible if the master was not restarted.
This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, I wrote a simple UT to repo this case.

I think we don't have to check the parent for table's shared lock. It is a shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)