Posted to issues@hbase.apache.org by "Vladimir Rodionov (JIRA)" <ji...@apache.org> on 2016/11/07 19:24:58 UTC

[jira] [Commented] (HBASE-14141) HBase Backup/Restore Phase 3: Filter WALs on backup to include only edits from backup tables

    [ https://issues.apache.org/jira/browse/HBASE-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15645161#comment-15645161 ] 

Vladimir Rodionov commented on HBASE-14141:
-------------------------------------------

There are two possible approaches.

h5. Direct filtering

We can filter WALs during the copy operation to include only edits for the tables being backed up.
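
To make the idea concrete, here is a minimal sketch of the filtering step only. It is not HBase code: the Edit class is a hypothetical stand-in for a WAL entry (table name plus an opaque payload), and filterForBackup is an illustrative name. A real implementation would have to run inside the copy job (DistCp or a custom MR job) and read actual WAL files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class WalFilterSketch {
    /** Hypothetical stand-in for one WAL entry: owning table plus an opaque payload. */
    static final class Edit {
        final String table;
        final String payload;

        Edit(String table, String payload) {
            this.table = table;
            this.payload = payload;
        }
    }

    /** Keeps only the edits whose table is in the backup set, preserving WAL order. */
    static List<Edit> filterForBackup(List<Edit> wal, Set<String> backupTables) {
        List<Edit> kept = new ArrayList<>();
        for (Edit e : wal) {
            if (backupTables.contains(e.table)) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Edit> wal = Arrays.asList(
            new Edit("t1", "put row1"),
            new Edit("t2", "put row7"),
            new Edit("t1", "delete row3"));
        Set<String> backupTables = new HashSet<>(Arrays.asList("t1"));
        // Only the two edits for t1 survive the filter.
        System.out.println(filterForBackup(wal, backupTables).size());
    }
}
```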

Pro:

# Flexibility?

Contra:

# Complexity. Requires a new version of DistCp or another custom MR job.
# Complexity. Tracking WALs eligible for deletion becomes a nightmare. When a WAL is closed, we need to record all tables in the current incremental backup set in the hbase:backup table, creating a record for every table:WAL pair with an initial state such as WAIT. We then have to update the state of these records after each incremental backup session and keep the WAL until all of its records are updated. Two issues:
## Atomicity of the combination of events: closing the WAL and getting the list of tables from hbase:backup.
## Handling table deletion and the removal of tables from the backup set.
# Complexity. All backup destination file structures need to be changed.
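
The bookkeeping described in the second point can be sketched as follows. This is an in-memory illustration under assumptions, not the hbase:backup schema: the class, method, and state names other than WAIT are hypothetical, untracked WALs are assumed not to be pinned, and dropping a removed table's records is only one possible way to handle the second sub-issue. The atomicity problem is not addressed at all, which is part of why the real thing is harder than it looks here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class WalRetentionSketch {
    /** WAIT comes from the description above; BACKED_UP is a hypothetical terminal state. */
    enum State { WAIT, BACKED_UP }

    // WAL name -> (table name -> state); stands in for table:WAL records in hbase:backup.
    private final Map<String, Map<String, State>> records = new HashMap<>();

    /** On WAL close: create one WAIT record per table in the current backup set. */
    void onWalClosed(String wal, Set<String> backupTables) {
        Map<String, State> perTable = new HashMap<>();
        for (String table : backupTables) {
            perTable.put(table, State.WAIT);
        }
        records.put(wal, perTable);
    }

    /** After an incremental backup session for a table has consumed this WAL. */
    void onTableBackedUp(String wal, String table) {
        Map<String, State> perTable = records.get(wal);
        if (perTable != null && perTable.containsKey(table)) {
            perTable.put(table, State.BACKED_UP);
        }
    }

    /** Assumed handling of table deletion or removal from the backup set: drop its records. */
    void onTableRemoved(String table) {
        for (Map<String, State> perTable : records.values()) {
            perTable.remove(table);
        }
    }

    /** A WAL may be deleted only when no record for it is still in WAIT. */
    boolean canDelete(String wal) {
        Map<String, State> perTable = records.get(wal);
        if (perTable == null) {
            return true; // assumption: a WAL this tracker never saw is not pinned by it
        }
        for (State s : perTable.values()) {
            if (s == State.WAIT) {
                return false;
            }
        }
        return true;
    }
}
```

Even in this toy form, every WAL carries one record per table, and each record must be updated by every backup session, table deletion, and backup-set change before the WAL can be released.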

h5. Have a separate WAL group for tables in backup

We have a dedicated WAL group for the tables we want to back up. A table MUST be assigned to this group if we want to back it up. This could be done by manipulating RegionGroupingStrategy, as [~jinghe] suggested.
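
As a rough illustration of the grouping decision, the sketch below maps a table name to a WAL group name. It deliberately does not implement the real RegionGroupingStrategy interface; the class name, the group names "backup" and "default", and the groupFor method are all hypothetical and only show the routing logic such a strategy would need.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class BackupWalGroupSketch {
    // Hypothetical group names, not HBase constants.
    static final String BACKUP_GROUP = "backup";
    static final String DEFAULT_GROUP = "default";

    private final Set<String> backupTables;

    BackupWalGroupSketch(Set<String> backupTables) {
        // Defensive copy so later changes to the caller's set do not affect routing.
        this.backupTables = new HashSet<>(backupTables);
    }

    /** Tables in the backup set write to the dedicated group; everything else to the default. */
    String groupFor(String tableName) {
        return backupTables.contains(tableName) ? BACKUP_GROUP : DEFAULT_GROUP;
    }

    public static void main(String[] args) {
        BackupWalGroupSketch strategy =
            new BackupWalGroupSketch(new HashSet<>(Arrays.asList("orders", "users")));
        System.out.println(strategy.groupFor("orders"));
        System.out.println(strategy.groupFor("scratch"));
    }
}
```

With this split, WALs in the dedicated group contain only edits from backed-up tables, so they can be copied whole and no per-edit filtering is needed at backup time.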

Pro:

# Simplicity of implementation

Contra:

# Requires a how-to in the documentation explaining how to optimize backup when only a subset of tables is backed up.

I personally like the latter one.



> HBase Backup/Restore Phase 3: Filter WALs on backup to include only edits from backup tables
> --------------------------------------------------------------------------------------------
>
>                 Key: HBASE-14141
>                 URL: https://issues.apache.org/jira/browse/HBASE-14141
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 2.0.0
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>              Labels: backup
>             Fix For: 2.0.0
>
>



