Posted to issues@hbase.apache.org by "Zheng Hu (JIRA)" <ji...@apache.org> on 2018/12/11 13:40:00 UTC

[jira] [Commented] (HBASE-21582) If call HBaseAdmin#snapshotAsync but forget call isSnapshotFinished, then SnapshotHFileCleaner will skip to run every time

    [ https://issues.apache.org/jira/browse/HBASE-21582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16717171#comment-16717171 ] 

Zheng Hu commented on HBASE-21582:
----------------------------------

After reading the code: we cannot simply remove the handler from the ConcurrentHashMap, because isSnapshotDone rethrows the handler's exception to the user. If we removed the handler from the map as soon as it finished, that exception would be lost too. So I introduced a periodic thread to clean up the finished handlers.
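
A minimal Java sketch of that idea, not the code in the attached patch. The class, its methods, and the retention window are hypothetical stand-ins for SnapshotManager's bookkeeping, and it assumes (as the description below implies) that any entry left in the map is treated as a running snapshot. A finished handler stays in the map for a while so that isSnapshotDone can still rethrow its exception, and a periodic task removes it afterwards.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the handler bookkeeping; not HBase code. */
final class SentinelCleanupSketch {

  /** Simplified, hypothetical view of a snapshot handler. */
  static final class Handler {
    private volatile boolean finished;
    private volatile long finishedAtMs;
    private volatile Exception failure; // null when the snapshot succeeded

    void finish(Exception failureOrNull, long nowMs) {
      this.failure = failureOrNull;
      this.finishedAtMs = nowMs;
      this.finished = true; // written last so readers see the fields above
    }

    boolean isFinished() { return finished; }
    long getFinishedAtMs() { return finishedAtMs; }
    Exception getFailure() { return failure; }
  }

  private final Map<String, Handler> handlers = new ConcurrentHashMap<>();
  private final long retainMs; // assumed retention window for finished handlers

  SentinelCleanupSketch(long retainMs) { this.retainMs = retainMs; }

  void register(String table, Handler handler) { handlers.put(table, handler); }

  /** Assumption: a cleaner treats any tracked handler as "snapshot is running". */
  boolean hasTrackedHandlers() { return !handlers.isEmpty(); }

  /** Reports completion and rethrows the handler's failure; does not remove it. */
  boolean isSnapshotDone(String table) throws Exception {
    Handler h = handlers.get(table);
    if (h == null) {
      throw new IllegalStateException("No snapshot known for " + table);
    }
    if (h.isFinished() && h.getFailure() != null) {
      throw h.getFailure();
    }
    return h.isFinished();
  }

  /** Drops handlers that finished at least retainMs before nowMs. */
  void cleanupFinished(long nowMs) {
    handlers.values().removeIf(
        h -> h.isFinished() && nowMs - h.getFinishedAtMs() >= retainMs);
  }

  /** Runs cleanupFinished periodically, independent of any client call. */
  ScheduledExecutorService startCleaner(long periodMs) {
    ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
    pool.scheduleAtFixedRate(
        () -> cleanupFinished(System.currentTimeMillis()),
        periodMs, periodMs, TimeUnit.MILLISECONDS);
    return pool;
  }
}
```

The retention window is the trade-off here: too short and a slow client may miss the failure, too long and the HFile cleaner stays blocked for that much longer after the snapshot finishes.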

> If call HBaseAdmin#snapshotAsync but forget call isSnapshotFinished, then SnapshotHFileCleaner will skip to run every time
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-21582
>                 URL: https://issues.apache.org/jira/browse/HBASE-21582
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>         Attachments: HBASE-21582.v1.patch
>
>
> This is because the SnapshotSentinel is removed from snapshotHandlers only in SnapshotManager#cleanupSentinels, and cleanupSentinels is called in only the following 3 cases:
> 1.  SnapshotManager#isSnapshotDone; 
> 2.  SnapshotManager#takeSnapshot; 
> 3. SnapshotManager#restoreOrCloneSnapshot
> So if isSnapshotDone is never called and no further snapshot, restore, or clone is requested, the SnapshotSentinel will stay in snapshotHandlers forever.
> But after HBASE-21387, the SnapshotHFileCleaner checks for and cleans unreferenced files only when no snapshot is being taken.
> I found this bug because in our XiaoMi branch-2 we implemented a soft delete feature: if someone deletes a table, the master first creates a snapshot, and only then does the table deletion begin. The implementation is quite simple: we use the snapshotManager to create the snapshot.
> {code}
> diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
> index 8f42e4a..6da6a64 100644
> --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
> +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
> @@ -2385,6 +2385,12 @@ public class HMaster extends HRegionServer implements MasterServices {
>        protected void run() throws IOException {
>          getMaster().getMasterCoprocessorHost().preDeleteTable(tableName);
>  
> +        if (snapshotBeforeDelete) {
> +          LOG.info("Take snapshot for " + tableName + " before deleting");
> +          snapshotManager
> +              .takeSnapshot(SnapshotDescriptionUtils.getSnapshotNameForDeletedTable(tableName));
> +        }
> +
>          LOG.info(getClientIdAuditPrefix() + " delete " + tableName);
>  
>          // TODO: We can handle/merge duplicate request
> {code}
> In the master, I found this log repeated endlessly after deleting a table:
> {code}
> org.apache.hadoop.hbase.master.snapshot.SnapshotFileCache: Not checking unreferenced files since snapshot is running, it will skip to clean the HFiles this time
> {code}
> This is because the snapshotHandlers are never cleaned up after calling snapshotManager#takeSnapshot. I think snapshotAsync may have the same problem.
