Posted to issues@hbase.apache.org by "Bo Cui (Jira)" <ji...@apache.org> on 2019/12/19 09:59:00 UTC

[jira] [Comment Edited] (HBASE-20616) TruncateTableProcedure is stuck in retry loop in TRUNCATE_TABLE_CREATE_FS_LAYOUT state

    [ https://issues.apache.org/jira/browse/HBASE-20616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999901#comment-16999901 ] 

Bo Cui edited comment on HBASE-20616 at 12/19/19 9:58 AM:
----------------------------------------------------------

[~brfrn169] [~hadoopqa] Hi,
{code:java}
...
if (!preserveSplits) {
  // if we are not preserving splits, generate a new single region
  regions = Arrays.asList(ModifyRegionUtils.createHRegionInfos(hTableDescriptor, null));
} else {
  regions = recreateRegionInfo(regions);
}
...{code}
In the old code, the regions are recreated, so I do not think there will be duplicate regions; the result, however, is that the table is left with invalid region directories in HDFS.
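For reference, recreateRegionInfo rebuilds each region descriptor from the old boundaries, along these lines (a sketch from memory of TruncateTableProcedure, so the exact code may differ by branch):
{code:java}
// Each new HRegionInfo keeps the old start/end keys but gets a fresh
// regionId (timestamp), hence a new encoded name and a new dir in HDFS.
private static List<HRegionInfo> recreateRegionInfo(final List<HRegionInfo> regions) {
  final ArrayList<HRegionInfo> newRegions = new ArrayList<>(regions.size());
  for (HRegionInfo hri : regions) {
    newRegions.add(new HRegionInfo(hri.getTable(), hri.getStartKey(), hri.getEndKey()));
  }
  return newRegions;
}
{code}
Since the encoded region names change on every attempt, the directories written by a failed attempt no longer match any live region, which is how the table ends up with invalid region directories in HDFS.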

So the test case should be modified accordingly:
{code:java}
long procId = ProcedureTestingUtility.submitAndWait(procExec,
    new TruncateTableProcedureOnHDFSFailure(procExec.getEnvironment(), tableName,
        preserveSplits));
// new assertions: the procedure must succeed, and the regions in meta must
// match the region directories actually present in HDFS
ProcedureTestingUtility.assertProcNotFailed(procExec, procId);
int regionInMeta = UTIL.getHBaseAdmin().getTableRegions(tableName).size();
Path tablePath = FSUtils.getTableDir(FSUtils.getRootDir(UTIL.getConfiguration()), tableName);
int regionInHDFS = FSUtils.getRegionDirs(
    FSUtils.getCurrentFileSystem(UTIL.getConfiguration()), tablePath).size();
assertEquals(regionInMeta, regionInHDFS);
{code}
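The added assertions first check that the procedure itself completed successfully, and then compare the number of regions in meta with the number of region directories under the table dir in HDFS; a leftover directory from an earlier attempt makes the HDFS count larger and fails the test.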



> TruncateTableProcedure is stuck in retry loop in TRUNCATE_TABLE_CREATE_FS_LAYOUT state
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-20616
>                 URL: https://issues.apache.org/jira/browse/HBASE-20616
>             Project: HBase
>          Issue Type: Bug
>          Components: amv2
>         Environment: HDP-2.5.3
>            Reporter: Toshihiro Suzuki
>            Assignee: Toshihiro Suzuki
>            Priority: Major
>             Fix For: 2.1.0
>
>         Attachments: 20616.master.004.patch, HBASE-20616.master.001.patch, HBASE-20616.master.002.patch, HBASE-20616.master.003.patch, HBASE-20616.master.004.patch
>
>
> At first, TruncateTableProcedure failed to write some files to HDFS in the TRUNCATE_TABLE_CREATE_FS_LAYOUT state for some reason:
> {code:java}
> 2018-05-15 08:00:25,346 WARN  [ProcedureExecutorThread-8] procedure.TruncateTableProcedure: Retriable error trying to truncate table=<namespace>:<table> state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
> java.io.IOException: java.util.concurrent.ExecutionException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /apps/hbase/data/.tmp/data/<namespace>/<table>/<region>/.regioninfo could only be replicated to 0 nodes instead of minReplication (=1).  There are <the number of DNs> datanode(s) running and no node(s) are excluded in this operation.
> ...
> {code}
> But at this point, it seemed that some of the files had actually been written to HDFS successfully.
> TruncateTableProcedure then became stuck in a retry loop in the TRUNCATE_TABLE_CREATE_FS_LAYOUT state, and the following log messages were shown repeatedly in the master log:
> {code:java}
> 2018-05-15 08:00:25,463 WARN  [ProcedureExecutorThread-8] procedure.TruncateTableProcedure: Retriable error trying to truncate table=<namespace>:<table> state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
> java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: The specified region already exists on disk: hdfs://<name>/apps/hbase/data/.tmp/data/<namespace>/<table>/<region>
> ...
> {code}
> It seems like this is because TruncateTableProcedure tried to rewrite the files that had already been written successfully on the first attempt.
> I think we need to delete all the files and directories that were written successfully in the previous attempt before retrying the TRUNCATE_TABLE_CREATE_FS_LAYOUT state.
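> A minimal sketch of that cleanup (hypothetical placement and variable names, not the attached patch): before (re)creating the layout, remove any partial table directory left under the .tmp area by the previous attempt.
> {code:java}
> // Hedged sketch: wipe a partial FS layout from the previous attempt before
> // retrying TRUNCATE_TABLE_CREATE_FS_LAYOUT; 'tempdir' is the master's .tmp dir.
> final Path tempTableDir = FSUtils.getTableDir(tempdir, tableName);
> if (fs.exists(tempTableDir) && !fs.delete(tempTableDir, true)) {
>   throw new IOException("Couldn't delete the partial layout at " + tempTableDir);
> }
> {code}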
> Actually, this issue was observed in HDP-2.5.3, but I think the upstream has the same issue. It also looks to me like CreateTableProcedure has a similar issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)