Posted to issues@hbase.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2019/02/13 02:52:00 UTC
[jira] [Commented] (HBASE-20657) Retrying RPC call for ModifyTableProcedure may get stuck

    [ https://issues.apache.org/jira/browse/HBASE-20657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16766695#comment-16766695 ] 

Hadoop QA commented on HBASE-20657:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} | {color:red} HBASE-20657 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.8.0/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-20657 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933276/HBASE-20657-4-master.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/15946/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Retrying RPC call for ModifyTableProcedure may get stuck
> --------------------------------------------------------
>
>                 Key: HBASE-20657
>                 URL: https://issues.apache.org/jira/browse/HBASE-20657
>             Project: HBase
>          Issue Type: Bug
>          Components: Client, proc-v2
>    Affects Versions: 2.0.0
>            Reporter: Sergey Soldatov
>            Assignee: stack
>            Priority: Critical
>             Fix For: 3.0.0, 2.3.0
>
>         Attachments: HBASE-20657-1-branch-2.patch, HBASE-20657-2-branch-2.patch, HBASE-20657-3-branch-2.patch, HBASE-20657-4-master.patch, HBASE-20657-testcase-branch2.patch
>
>
> Env: 2 masters, 1 RS. 
> Steps to reproduce: Kill the active master while a ModifyTableProcedure is executing. 
> If the table has enough regions, some of them may already be closed by the time the secondary master becomes active. When the client then retries the call against the new active master, a new ModifyTableProcedure is created and gets stuck during MODIFY_TABLE_REOPEN_ALL_REGIONS state handling. That happens because:
> 1. When retrying from the client side, we call modifyTableAsync, which creates a procedure with a new nonce key:
> {noformat}
>          ModifyTableRequest request = RequestConverter.buildModifyTableRequest(
>             td.getTableName(), td, ng.getNonceGroup(), ng.newNonce());
> {noformat}
>  So on the server side, it is treated as a new procedure and starts executing immediately.
> 2. While processing MODIFY_TABLE_REOPEN_ALL_REGIONS, we create a MoveRegionProcedure for each region, but it checks whether the region is online (and it is not), so it fails immediately, forcing the procedure to restart.
> [~ankit@apache.org] saw a similar case where two concurrent ModifyTable procedures were running and got stuck in a similar way. 
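
Point 1 of the description can be illustrated with a toy model. The sketch below is not HBase code: ToyMaster and its submit method are hypothetical stand-ins for a server-side nonce registry, used only to show why a retry that carries a fresh nonce is treated as a brand-new procedure, while a retry that reuses the original nonce would map back to the existing one.

```java
import java.util.HashMap;
import java.util.Map;

public class NonceRetrySketch {

    /** Hypothetical stand-in for a master-side nonce registry; not HBase's implementation. */
    static final class ToyMaster {
        private final Map<String, Long> procIdByNonceKey = new HashMap<>();
        private long nextProcId = 1;

        /** Returns the existing procedure id for a known nonce key, otherwise registers a new one. */
        long submit(long nonceGroup, long nonce) {
            String key = nonceGroup + ":" + nonce;
            Long existing = procIdByNonceKey.get(key);
            if (existing != null) {
                return existing; // same nonce key: the request is recognized as a retry
            }
            long procId = nextProcId++;
            procIdByNonceKey.put(key, procId); // unseen nonce key: a new procedure is created
            return procId;
        }
    }

    public static void main(String[] args) {
        ToyMaster master = new ToyMaster();
        long nonceGroup = 42L;

        long first = master.submit(nonceGroup, 1001L);
        long retrySameNonce = master.submit(nonceGroup, 1001L);
        long retryNewNonce = master.submit(nonceGroup, 1002L); // analogous to calling newNonce() on retry

        System.out.println("same nonce maps to same procedure: " + (first == retrySameNonce));
        System.out.println("new nonce creates a new procedure: " + (first != retryNewNonce));
    }
}
```

In this toy model, only the retry that reuses the original nonce is deduplicated. The retry with a new nonce gets its own procedure id, which mirrors the second ModifyTableProcedure described in the report.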



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)