Posted to dev@hbase.apache.org by "HemaKumar (Jira)" <ji...@apache.org> on 2022/10/02 11:38:00 UTC
[jira] [Created] (HBASE-27404) Long running ExportSnapshot fails with Can't find hfile Exception.
HemaKumar created HBASE-27404:
---------------------------------
Summary: Long running ExportSnapshot fails with Can't find hfile Exception.
Key: HBASE-27404
URL: https://issues.apache.org/jira/browse/HBASE-27404
Project: HBase
Issue Type: Bug
Components: snapshots
Reporter: HemaKumar
ExportSnapshot jobs that run longer than the destination cluster's hbase.master.hfilecleaner.ttl value fail with {_}Can't find hfile: <hfile> in the real or archive folders{_}. HFiles already copied to the archive folder on the destination cluster are deleted there by the HFile cleaners, because SnapshotHFileCleaner does not protect them.
# ExportSnapshot copies the HFiles into the archive folders on the destination cluster.
# The manifest of an in-progress ExportSnapshot stays in /hbase/.hbase-snapshot/.tmp until the export completes.
# SnapshotHFileCleaner ignores the /hbase/.hbase-snapshot/.tmp directory when it looks for snapshot reference files:
{code:java}
private void refreshCache() throws IOException {
  // just list the snapshot directory directly, do not check the modification time for the root
  // snapshot directory, as some file system implementations do not modify the parent directory's
  // modTime when there are new sub items, for example, S3.
  FileStatus[] snapshotDirs = FSUtils.listStatus(fs, snapshotDir,
    p -> !p.getName().equals(SnapshotDescriptionUtils.SNAPSHOT_TMP_DIR_NAME));
{code}
# Because SnapshotHFileCleaner misses the in-progress snapshot's references, TimeToLiveHFileCleaner marks the in-progress export's HFiles that are older than hbase.master.hfilecleaner.ttl (that is, copied more than that long ago) for deletion.
# This causes ExportSnapshot to fail at the verification stage.
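The interaction described in the steps above can be sketched roughly as follows. This is a simplified, self-contained model and not the HBase implementation: the class CleanerSketch, its method names, and the map of snapshot directories to referenced HFiles are all hypothetical. It assumes a file is removed only when every cleaner agrees it is deletable, so a file becomes deletable once its TTL has expired and no scanned snapshot directory references it.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the two cleaner checks; not HBase code.
final class CleanerSketch {
    // Mirrors the name of the in-progress snapshot directory from the report.
    static final String SNAPSHOT_TMP_DIR_NAME = ".tmp";

    // Collect HFile names referenced by snapshot directories.
    // With includeTmp == false the ".tmp" directory is skipped, as in the quoted filter.
    static Set<String> referencedHFiles(Map<String, Set<String>> snapshotDirs,
                                        boolean includeTmp) {
        Set<String> refs = new HashSet<>();
        for (Map.Entry<String, Set<String>> e : snapshotDirs.entrySet()) {
            if (!includeTmp && e.getKey().equals(SNAPSHOT_TMP_DIR_NAME)) {
                continue; // references of in-progress exports are never seen
            }
            refs.addAll(e.getValue());
        }
        return refs;
    }

    // A file is deletable only if the TTL check and the snapshot check both allow it.
    static boolean isDeletable(String hfile, long ageMs, long ttlMs, Set<String> refs) {
        boolean ttlExpired = ageMs > ttlMs;           // TTL-style check
        boolean unreferenced = !refs.contains(hfile); // snapshot-reference check
        return ttlExpired && unreferenced;
    }
}
```

In this model, an HFile referenced only from the .tmp directory becomes deletable as soon as its age exceeds the TTL, which matches the failure reported here. Passing includeTmp as true shows the effect of also honoring in-progress references; whether that is the right fix for HBase is left open.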
Workaround:
Increase hbase.master.hfilecleaner.ttl on the destination cluster to a value greater than the ExportSnapshot job's run time.
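As an illustration of the workaround, the destination cluster's hbase-site.xml could carry an entry like the one below. The value is an assumption chosen for the example (24 hours, expressed in milliseconds) and should be sized to the longest expected export. A larger TTL also keeps all archived HFiles around longer, so the destination cluster needs the extra space.

```xml
<property>
  <name>hbase.master.hfilecleaner.ttl</name>
  <!-- Illustrative value only: 24 hours in milliseconds -->
  <value>86400000</value>
</property>
```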
I think this issue needs to be fixed in the SnapshotHFileCleaner flow so that long-running ExportSnapshot jobs can succeed.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)