You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2018/06/01 09:43:00 UTC

[jira] [Updated] (HBASE-20668) Exception from FileSystem operation in finally block of ExportSnapshot#doWork may hide exception from FileUtil.copy call

     [ https://issues.apache.org/jira/browse/HBASE-20668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-20668:
---------------------------
    Attachment: 20668.v1.txt

> Exception from FileSystem operation in finally block of ExportSnapshot#doWork may hide exception from FileUtil.copy call
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-20668
>                 URL: https://issues.apache.org/jira/browse/HBASE-20668
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>            Priority: Major
>         Attachments: 20668.v1.txt
>
>
> I was debugging the following error [~romil.choksi] saw during testing ExportSnapshot :
> {code}
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|2018-06-01 02:40:52,358 ERROR [main] util.AbstractHBaseTool: Error      running command-line tool
> 2018-06-01 02:40:52,363|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|java.io.FileNotFoundException: Directory/File does not exist /apps/     hbase/data/.hbase-snapshot/.tmp/snapshot_table_334546
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at org.apache.hadoop.hdfs.server.namenode.FSDirectory.                  checkOwner(FSDirectory.java:1777)
> 2018-06-01 02:40:52,364|INFO|MainThread|machine.py:167 - run()||GUID=1cacb7bc-f7cc-4710-82e0-4a4513f0c1f9|at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.                  setOwner(FSDirAttrOp.java:82)
> {code}
> Here is corresponding code (with extra log added):
> {code}
>     try {
>       LOG.info("Copy Snapshot Manifest from " + snapshotDir + " to " + initialOutputSnapshotDir);
>       boolean ret = FileUtil.copy(inputFs, snapshotDir, outputFs, initialOutputSnapshotDir, false,
>           false, conf);
>       LOG.info("return val = " + ret);
>     } catch (IOException e) {
>       LOG.warn("Failed to copy the snapshot directory: from=" +
>           snapshotDir + " to=" + initialOutputSnapshotDir, e);
>       throw new ExportSnapshotException("Failed to copy the snapshot directory: from=" +
>         snapshotDir + " to=" + initialOutputSnapshotDir, e);
>     } finally {
>       if (filesUser != null || filesGroup != null) {
>         LOG.warn((filesUser == null ? "" : "Change the owner of " + needSetOwnerDir + " to "
>             + filesUser)
>             + (filesGroup == null ? "" : ", Change the group of " + needSetOwnerDir + " to "
>             + filesGroup));
>         setOwner(outputFs, needSetOwnerDir, filesUser, filesGroup, true);
>       }
> {code}
> "return val = " was not seen in rerun of the test.
> This is what the additional log revealed:
> {code}
> 2018-06-01 09:22:54,247|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|2018-06-01 09:22:54,241 WARN  [main] snapshot.ExportSnapshot: Failed to copy the snapshot directory: from=hdfs://ns1/apps/hbase/data/.hbase-snapshot/snapshot_table_157842 to=hdfs://ns3/apps/hbase/data/.hbase-snapshot/.tmp/snapshot_table_157842
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|org.apache.hadoop.security.AccessControlException: Permission denied:   user=hbase, access=WRITE, inode="/apps/hbase/data/.hbase-snapshot/.tmp":hrt_qa:hadoop:dr----x-wT
> 2018-06-01 09:22:54,248|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.          check(FSPermissionChecker.java:399)
> 2018-06-01 09:22:54,249|INFO|MainThread|machine.py:167 - run()||GUID=3961d249-9981-429d-81a8-39c7df53cf58|at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.          checkPermission(FSPermissionChecker.java:255)
> {code}
> It turned out that the exception from {{setOwner}} call in the finally block eclipsed the real exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)