Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2016/10/10 20:50:21 UTC

[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading

    [ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563463#comment-15563463 ] 

Ted Yu commented on HBASE-14417:
--------------------------------

While working on BackupHFileCleaner, the counterpart to ReplicationHFileCleaner, I noticed a potential impact on the server hosting hbase:backup, because the cleaner needs up-to-date information on the hfiles that are still referenced by incremental backups. A sketch of the plugin is below.
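
To make the constraint concrete, here is a minimal sketch of what such a cleaner plugin could look like, assuming it extends BaseHFileCleanerDelegate the same way ReplicationHFileCleaner does. The referencedFiles set and the setReferencedFiles() refresh hook are placeholders, not a proposed API; how that set is kept current is exactly the open question.

{code:java}
import java.util.Collections;
import java.util.Set;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.hbase.master.cleaner.BaseHFileCleanerDelegate;

public class BackupHFileCleaner extends BaseHFileCleanerDelegate {

  // hfiles still referenced by incremental backup; keeping this set current
  // without hammering the server hosting hbase:backup is the open question.
  private volatile Set<String> referencedFiles = Collections.emptySet();

  @Override
  protected boolean isFileDeletable(FileStatus fStat) {
    // Keep an archived hfile as long as some incremental backup still needs it.
    return !referencedFiles.contains(fStat.getPath().getName());
  }

  // Placeholder refresh hook; the source of this set is what we are discussing.
  void setReferencedFiles(Set<String> files) {
    this.referencedFiles = files;
  }
}
{code}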

One potential approach is to store the hfile information in zookeeper.
This would also alleviate the issue mentioned above about reducing the number of BulkLoadDescriptors written to the hbase:backup table. A rough sketch of the idea follows.
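
Purely as an illustration of the zookeeper idea, something along these lines. The znode layout (/hbase/backup/bulkload/<table>/<hfile>) and the class and method names are invented for the example, and parent znodes are assumed to already exist (a real version would likely go through ZKUtil and create parents as needed).

{code:java}
import java.util.List;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class BulkLoadZKTracker {
  private static final String ROOT = "/hbase/backup/bulkload";
  private final ZooKeeper zk;

  public BulkLoadZKTracker(ZooKeeper zk) {
    this.zk = zk;
  }

  // Record a bulk-loaded hfile so the cleaner knows not to delete it.
  public void register(String table, String hfileName)
      throws KeeperException, InterruptedException {
    zk.create(ROOT + "/" + table + "/" + hfileName, new byte[0],
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
  }

  // The cleaner reads the still-referenced hfiles from zookeeper instead of
  // scanning hbase:backup, taking load off the server hosting that table.
  public List<String> referencedFiles(String table)
      throws KeeperException, InterruptedException {
    return zk.getChildren(ROOT + "/" + table, false);
  }

  // Once an incremental backup has captured the hfile, drop the reference.
  public void unregister(String table, String hfileName)
      throws KeeperException, InterruptedException {
    zk.delete(ROOT + "/" + table + "/" + hfileName, -1);
  }
}
{code}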

Any suggestions, [~mbertozzi] [~vrodionov] ?

> Incremental backup and bulk loading
> -----------------------------------
>
>                 Key: HBASE-14417
>                 URL: https://issues.apache.org/jira/browse/HBASE-14417
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 2.0.0
>            Reporter: Vladimir Rodionov
>            Assignee: Ted Yu
>            Priority: Critical
>              Labels: backup
>             Fix For: 2.0.0
>
>         Attachments: 14417.v1.txt, 14417.v2.txt, 14417.v6.txt
>
>
> Currently, incremental backup is based on WAL files. Bulk data loading bypasses WALs for obvious reasons, breaking incremental backups. The only way to continue backups after bulk loading is to create a new full backup of the table. This may not be feasible for customers who do bulk loading regularly (say, every day).
> Google doc for design:
> https://docs.google.com/document/d/1ACCLsecHDvzVSasORgqqRNrloGx4mNYIbvAU7lq5lJE



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)