Posted to issues@hbase.apache.org by "Jingyun Tian (JIRA)" <ji...@apache.org> on 2019/02/21 07:51:00 UTC

[jira] [Commented] (HBASE-21934) SplitWALProcedure get stuck during ITBLL

    [ https://issues.apache.org/jira/browse/HBASE-21934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773780#comment-16773780 ] 

Jingyun Tian commented on HBASE-21934:
--------------------------------------

After checking the code and logs, I found the problem: once we dispatch these operations to the region server, the Set that stores them is set to null.

{code}
public synchronized void dispatch() {
  if (operations != null) {
    remoteDispatch(getKey(), operations);
    this.operations = null;
  }
}

{code}

Then, when the target region server crashes and abortOperationsInQueue is called, it only fails the operations that have not been sent yet, so the operations that were already dispatched are never aborted.

{code}

public synchronized void abortOperationsInQueue() {
  if (operations != null) {
    abortPendingOperations(getKey(), operations);
    this.operations = null;
  }
}

{code}
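
To make the gap concrete, here is a minimal, self-contained sketch of the same pattern. It is not HBase code: the class name LostAbortSketch, the String operations, and the sent/aborted lists are all hypothetical stand-ins, and only the two synchronized methods mirror the snippets above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the buffer node discussed above; not HBase code.
class LostAbortSketch {
    private Set<String> operations = new HashSet<>();
    final List<String> sent = new ArrayList<>();    // stands in for remoteDispatch
    final List<String> aborted = new ArrayList<>(); // stands in for abortPendingOperations

    synchronized void add(String op) {
        if (operations == null) {
            operations = new HashSet<>();
        }
        operations.add(op);
    }

    // Mirrors dispatch(): hand the batch to the server, then forget it.
    synchronized void dispatch() {
        if (operations != null) {
            sent.addAll(operations);
            operations = null;
        }
    }

    // Mirrors abortOperationsInQueue(): only still-queued operations are failed.
    synchronized void abortOperationsInQueue() {
        if (operations != null) {
            aborted.addAll(operations);
            operations = null;
        }
    }

    public static void main(String[] args) {
        LostAbortSketch node = new LostAbortSketch();
        node.add("splitWAL-1");
        node.dispatch();               // splitWAL-1 is now in flight
        node.add("splitWAL-2");        // queued, not yet sent
        node.abortOperationsInQueue(); // simulated region server crash
        System.out.println("sent=" + node.sent + " aborted=" + node.aborted);
    }
}
```

In this sketch only splitWAL-2 ends up in the aborted list. splitWAL-1 was already dispatched, so nothing tells it that the server is gone, which matches the stuck procedure seen in the logs.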

I'll add a test for this problem later. To solve it, one option is to remove an operation only once it has finished. Another is to go through all procedures and find the ones related to the crashed region server.
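
As a rough illustration of the first option (remove an operation only when it has finished), the sketch below keeps dispatched operations in a separate in-flight set until they complete, so an abort can reach them too. This is only an assumption about how such a fix might look, not the actual patch: TrackInFlightSketch, complete() and abortAll() are hypothetical names, and operations are plain strings.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of "remove an operation only when it finishes"; not HBase code.
class TrackInFlightSketch {
    private final Set<String> queued = new HashSet<>();
    private final Set<String> inFlight = new HashSet<>();
    final Set<String> aborted = new HashSet<>();

    synchronized void add(String op) {
        queued.add(op);
    }

    // Dispatching moves operations to the in-flight set instead of dropping them.
    synchronized void dispatch() {
        inFlight.addAll(queued);
        queued.clear();
    }

    // Called when the remote side reports that an operation finished.
    synchronized void complete(String op) {
        inFlight.remove(op);
    }

    // On server crash, fail both the queued and the still-running operations.
    synchronized void abortAll() {
        aborted.addAll(queued);
        aborted.addAll(inFlight);
        queued.clear();
        inFlight.clear();
    }

    public static void main(String[] args) {
        TrackInFlightSketch node = new TrackInFlightSketch();
        node.add("splitWAL-1");
        node.add("splitWAL-2");
        node.dispatch();             // both operations are in flight
        node.complete("splitWAL-1"); // one finishes normally
        node.add("splitWAL-3");      // queued, not yet sent
        node.abortAll();             // simulated region server crash
        System.out.println("aborted=" + node.aborted);
    }
}
```

Here the finished operation is left alone, while the in-flight and queued ones are both failed. The trade-off is extra bookkeeping on every completion, which the second option (scanning all procedures for the crashed server) would avoid.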

> SplitWALProcedure get stuck during ITBLL
> ----------------------------------------
>
>                 Key: HBASE-21934
>                 URL: https://issues.apache.org/jira/browse/HBASE-21934
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Jingyun Tian
>            Assignee: Jingyun Tian
>            Priority: Major
>
> I encountered a problem where the master assigned a splitWALRemoteProcedure to a region server, and the log of that region server says it failed to recover the lease of the file. The region server was then killed by chaosMonkey. As a result, the procedure never times out and hangs there forever.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)