Posted to dev@hbase.apache.org by "Duo Zhang (Jira)" <ji...@apache.org> on 2021/09/13 01:50:00 UTC

[jira] [Resolved] (HBASE-25891) Remove dependence on storing WAL filenames for backup

     [ https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang resolved HBASE-25891.
-------------------------------
    Hadoop Flags: Reviewed
      Resolution: Fixed

Pushed to master. Thanks [~rda3mon] for contributing and [~stack] for reviewing.

Please fill in the release note, [~rda3mon], if necessary.

Thanks.

> Remove dependence on storing WAL filenames for backup
> -----------------------------------------------------
>
>                 Key: HBASE-25891
>                 URL: https://issues.apache.org/jira/browse/HBASE-25891
>             Project: HBase
>          Issue Type: Improvement
>          Components: backup&restore
>    Affects Versions: 3.0.0-alpha-1
>            Reporter: Mallikarjun
>            Assignee: Mallikarjun
>            Priority: Major
>             Fix For: 3.0.0-alpha-2
>
>
> Context:
> Currently, WAL filenames are stored in the `backup:system` meta table:
> {code:java}
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
> {code}
> In addition, every backup (incremental and full) performs a log roll just before the backup is taken, and stores the timestamp of that log roll per regionserver per backup in the following format:
> {code:java}
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 column=meta:rs-log-ts, timestamp=1622887363301, value=\x00\x00\x01y\xDB\x81ar
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 column=meta:rs-log-ts, timestamp=1622887363275, value=\x00\x00\x01y\xDB\x81\x85
> {code}
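To make the `rs-log-ts` values above easier to read, here is a minimal sketch that decodes one of them. It assumes the value is an 8-byte big-endian long holding epoch milliseconds (the layout HBase's `Bytes.toBytes(long)` produces); the issue text does not state this explicitly. The class name `RsLogTsDecoder` is hypothetical.

```java
import java.nio.ByteBuffer;

public class RsLogTsDecoder {

  /** Decodes an 8-byte big-endian value into a long (assumed to be epoch milliseconds). */
  static long decode(byte[] value) {
    if (value.length != Long.BYTES) {
      throw new IllegalArgumentException("expected 8 bytes, got " + value.length);
    }
    // ByteBuffer reads in big-endian order by default.
    return ByteBuffer.wrap(value).getLong();
  }

  public static void main(String[] args) {
    // \x00\x00\x01y\xDB\x81ar from the first row ('y' = 0x79, 'a' = 0x61, 'r' = 0x72).
    byte[] value = {0x00, 0x00, 0x01, 0x79, (byte) 0xDB, (byte) 0x81, 0x61, 0x72};
    System.out.println(decode(value));
  }
}
```

Under that assumption the first row decodes to 1622885359986, roughly 33 minutes before the cell timestamp 1622887363301. The third row shows only seven bytes as printed, so it cannot be decoded this way as shown.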
>  
> The WAL references stored in `backup:system` are used in 2 cases.
> *Use Case 1.*
> *Existing Design:* `BackupLogCleaner` uses these references to clean up WALs that have already been backed up.
> *New Design:*
> Since the log roll timestamp is stored per regionserver as part of each backup, we can check all previous successful backups and identify which logs are to be retained and which are to be cleaned up, as follows:
>  * Identify the latest successful backup performed for each table.
>  * For each backup identified above, identify the oldest log roll timestamp per regionserver per table.
>  * Any WAL older than the oldest log roll timestamp recorded for any backed-up table can be removed by `BackupLogCleaner`.
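The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: the `Backup` class, its fields, the strict `<` comparison, and the choice to retain WALs of unknown regionservers are all hypothetical. The filename parsing assumes the `host%2Cport%2Cstartcode.timestamp` shape seen in the `wals:` rows.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WalRetentionSketch {

  /** Hypothetical, simplified record of one successful backup. */
  static final class Backup {
    final String id;
    final long startTs;
    final List<String> tables;
    final Map<String, Long> rollTsByServer; // "host:port" -> log roll timestamp

    Backup(String id, long startTs, List<String> tables, Map<String, Long> rollTsByServer) {
      this.id = id;
      this.startTs = startTs;
      this.tables = tables;
      this.rollTsByServer = rollTsByServer;
    }
  }

  /** Step 1: the newest successful backup for each table. */
  static Map<String, Backup> latestBackupPerTable(List<Backup> successful) {
    Map<String, Backup> latest = new HashMap<>();
    for (Backup b : successful) {
      for (String table : b.tables) {
        Backup seen = latest.get(table);
        if (seen == null || b.startTs > seen.startTs) {
          latest.put(table, b);
        }
      }
    }
    return latest;
  }

  /** Step 2: per regionserver, the oldest roll timestamp across the given backups. */
  static Map<String, Long> oldestRollTsPerServer(Collection<Backup> backups) {
    Map<String, Long> oldest = new HashMap<>();
    for (Backup b : backups) {
      for (Map.Entry<String, Long> e : b.rollTsByServer.entrySet()) {
        oldest.merge(e.getKey(), e.getValue(), Math::min);
      }
    }
    return oldest;
  }

  /** Step 3: a WAL is deletable only if it is older than its server's boundary. */
  static boolean canDelete(String walName, Map<String, Long> oldest) {
    String decoded = walName.replace("%2C", ",");
    int dot = decoded.lastIndexOf('.');
    String[] server = decoded.substring(0, dot).split(",");
    long walTs = Long.parseLong(decoded.substring(dot + 1));
    Long boundary = oldest.get(server[0] + ":" + server[1]);
    // Assumption: retain the WAL when no backup has recorded a roll for its server.
    return boundary != null && walTs < boundary;
  }

  public static void main(String[] args) {
    // Made-up backups and roll timestamps, for illustration only.
    List<Backup> backups = Arrays.asList(
        new Backup("backup_1", 100L, Arrays.asList("t1", "t2"),
            Collections.singletonMap("preprod-dn-1:16020", 1621998000000L)),
        new Backup("backup_2", 200L, Arrays.asList("t1"),
            Collections.singletonMap("preprod-dn-1:16020", 1622001000000L)));
    Map<String, Long> boundary = oldestRollTsPerServer(latestBackupPerTable(backups).values());
    System.out.println(canDelete("preprod-dn-1%2C16020%2C1614844389000.1621996160175", boundary));
    System.out.println(canDelete("preprod-dn-1%2C16020%2C1614844389000.1621999760280", boundary));
  }
}
```

In this example table `t2` was last covered by the older backup, so its roll timestamp sets the boundary: the first WAL is deletable and the second is retained, even though a newer backup of `t1` exists.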
>  
> *Use Case 2.* 
> *Existing Design:* During an incremental backup, the system table is checked for WALs that have already been backed up, so that no WAL is backed up again.
> *New Design:*
>  * Incremental backup already identifies which WALs are to be backed up using the `rslogts:` entries mentioned above.
>  * Additionally, it checks `wals:` to ensure that no log is backed up a second time. This check is redundant and has shown no extra benefit.
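To illustrate why the `wals:` lookup adds nothing, here is a minimal sketch of selecting WALs for an incremental backup from the `rslogts:` timestamps alone. It is not the code from the patch: the class and method names are hypothetical, and both the strict `>` comparison and the decision to include WALs of a regionserver with no recorded roll are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class IncrementalWalSelectionSketch {

  /** Selects WALs newer than the previous backup's roll timestamp for their regionserver. */
  static List<String> selectWals(List<String> walNames, Map<String, Long> previousRollTs) {
    List<String> selected = new ArrayList<>();
    for (String name : walNames) {
      // Assumed filename shape: host%2Cport%2Cstartcode.timestamp
      String decoded = name.replace("%2C", ",");
      int dot = decoded.lastIndexOf('.');
      String[] server = decoded.substring(0, dot).split(",");
      long walTs = Long.parseLong(decoded.substring(dot + 1));
      Long lastRoll = previousRollTs.get(server[0] + ":" + server[1]);
      // Assumption: a server with no recorded roll has never been backed up.
      if (lastRoll == null || walTs > lastRoll) {
        selected.add(name);
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    // Made-up roll timestamp, for illustration only.
    Map<String, Long> previousRollTs =
        Collections.singletonMap("preprod-dn-1:16020", 1621998000000L);
    List<String> wals = Arrays.asList(
        "preprod-dn-1%2C16020%2C1614844389000.1621996160175",
        "preprod-dn-1%2C16020%2C1614844389000.1621999760280");
    System.out.println(selectWals(wals, previousRollTs));
  }
}
```

Because each WAL falls on exactly one side of the previous roll timestamp, a file picked up by an earlier backup is not selected again, which is what the separate `wals:` check was guarding against.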


