Posted to dev@hbase.apache.org by "HemaKumar (Jira)" <ji...@apache.org> on 2022/10/02 11:38:00 UTC

[jira] [Created] (HBASE-27404) Long running ExportSnapshot fails with Can't find hfile Exception.

HemaKumar created HBASE-27404:
---------------------------------

             Summary: Long running ExportSnapshot fails with Can't find hfile Exception.
                 Key: HBASE-27404
                 URL: https://issues.apache.org/jira/browse/HBASE-27404
             Project: HBase
          Issue Type: Bug
          Components: snapshots
            Reporter: HemaKumar


ExportSnapshot jobs that run longer than the destination cluster's hbase.master.hfilecleaner.ttl value are failing with {_}Can't find hfile: <hfile> in the real or archive folders{_}. The HFiles already copied to the archive folder on the destination cluster are being deleted by the SnapshotHFileCleaner cleaner.

 
 # ExportSnapshot copies the archived hfiles to the archive folder on the destination cluster.
 # The manifest of an in-progress ExportSnapshot stays in /hbase/.hbase-snapshot/.tmp until the export completes.
 # The SnapshotHFileCleaner flow ignores the /hbase/.hbase-snapshot/.tmp directory when it looks for snapshot reference files:

{code:java}
private void refreshCache() throws IOException {
  // just list the snapshot directory directly, do not check the modification time for the root
  // snapshot directory, as some file system implementations do not modify the parent directory's
  // modTime when there are new sub items, for example, S3.
  FileStatus[] snapshotDirs = FSUtils.listStatus(fs, snapshotDir,
    p -> !p.getName().equals(SnapshotDescriptionUtils.SNAPSHOT_TMP_DIR_NAME));
  // ... (rest of the method omitted)
{code}
 # Because SnapshotHFileCleaner misses the in-progress snapshot's references, TimeToLiveHFileCleaner marks the HFiles that were copied more than hbase.master.hfilecleaner.ttl ago for deletion, even though they belong to the in-progress ExportSnapshot.
 # This causes ExportSnapshot to fail at the verification stage.
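
The interaction described in the steps above can be illustrated with a minimal, self-contained sketch. It is not HBase code: the class CleanerChainSketch, its methods, and the file and snapshot names are hypothetical, and it assumes a file is removed only when every cleaner in the chain considers it deletable. It shows that a copied hfile referenced only from the .tmp directory becomes deletable once it is older than the TTL.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Hypothetical simulation of the cleaner decision; not the real HBase classes.
class CleanerChainSketch {
    static final String SNAPSHOT_TMP_DIR_NAME = ".tmp";

    // Collects the hfiles referenced by snapshot directories.
    // When includeTmp is false, the in-progress directory is skipped,
    // which mirrors the filter quoted from refreshCache().
    static Set<String> referencedFiles(Map<String, Set<String>> snapshotDirs, boolean includeTmp) {
        Set<String> refs = new HashSet<>();
        for (Map.Entry<String, Set<String>> dir : snapshotDirs.entrySet()) {
            if (!includeTmp && dir.getKey().equals(SNAPSHOT_TMP_DIR_NAME)) {
                continue;
            }
            refs.addAll(dir.getValue());
        }
        return refs;
    }

    // Assumption: a file is deleted only if it is unreferenced AND older than the TTL.
    static boolean isDeletable(String hfile, long ageMs, long ttlMs, Set<String> refs) {
        boolean unreferenced = !refs.contains(hfile);
        boolean expired = ageMs > ttlMs;
        return unreferenced && expired;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> snapshotDirs = new HashMap<>();
        snapshotDirs.put("completed-snapshot", Collections.singleton("hfile-a"));
        snapshotDirs.put(SNAPSHOT_TMP_DIR_NAME, Collections.singleton("hfile-b"));

        long ttlMs = 300_000L; // illustrative TTL of 5 minutes
        long ageMs = 600_000L; // hfile-b was copied 10 minutes ago

        // .tmp skipped: hfile-b looks unreferenced and is past the TTL.
        System.out.println(isDeletable("hfile-b", ageMs, ttlMs, referencedFiles(snapshotDirs, false)));
        // .tmp included: the in-progress export protects hfile-b.
        System.out.println(isDeletable("hfile-b", ageMs, ttlMs, referencedFiles(snapshotDirs, true)));
    }
}
```

In this sketch the first call reports the file as deletable and the second does not, which is the difference a fix in the SnapshotHFileCleaner flow would need to make.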

 

Workaround:

Increase hbase.master.hfilecleaner.ttl on the destination cluster to a value greater than the run time of the ExportSnapshot job.
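
As a rough illustration of sizing the workaround value: hbase.master.hfilecleaner.ttl is expressed in milliseconds, so the expected job run time plus a safety margin has to be converted accordingly. The helper class HFileCleanerTtlExample, the six-hour run time, and the two-hour margin below are hypothetical values chosen for the example.

```java
import java.time.Duration;

// Hypothetical helper for sizing the workaround TTL; not part of HBase.
class HFileCleanerTtlExample {
    // Returns a TTL in milliseconds that exceeds the expected export
    // run time by the given safety margin.
    static long ttlMillisFor(Duration expectedExportRunTime, Duration safetyMargin) {
        return expectedExportRunTime.plus(safetyMargin).toMillis();
    }

    public static void main(String[] args) {
        // Assumed figures: a 6 hour export with a 2 hour margin.
        long ttlMs = ttlMillisFor(Duration.ofHours(6), Duration.ofHours(2));
        System.out.println("hbase.master.hfilecleaner.ttl=" + ttlMs);
    }
}
```

The resulting value would be set in the configuration of the destination cluster's master. A larger TTL also keeps every other archived hfile around longer, so the margin is a trade-off against archive disk usage.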

 

I think this issue needs to be fixed in SnapshotHFileCleaner flow so that long-running ExportSnapshot jobs can succeed.


