Posted to issues@hbase.apache.org by "Vladimir Rodionov (JIRA)" <ji...@apache.org> on 2016/02/25 23:47:18 UTC

[jira] [Updated] (HBASE-15331) HBase Backup/Restore Phase 2: Optimized Restore operation

     [ https://issues.apache.org/jira/browse/HBASE-15331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vladimir Rodionov updated HBASE-15331:
--------------------------------------
    Description: The current restore implementation uses a WALReplay M/R job. This has performance and stability problems, since it uses the HBase client API to insert data. We have to migrate to a bulk load approach: generate HFiles directly from snapshot and incremental images. We run a separate M/R job for every backup image between the last FULL backup and the incremental backup we are restoring to, and for every table in the list (image). With 10 tables and 30 days of incremental backup images, this results in 30 x 10 = 300 M/R jobs.  (was: The current restore implementation uses a WALReplay M/R job. This has performance and stability problems, since it uses the HBase client API to insert data. We have to migrate to a bulk load approach: generate HFiles directly from snapshot and incremental images.)

> HBase Backup/Restore Phase 2: Optimized Restore operation
> ---------------------------------------------------------
>
>                 Key: HBASE-15331
>                 URL: https://issues.apache.org/jira/browse/HBASE-15331
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>
> The current restore implementation uses a WALReplay M/R job. This has performance and stability problems, since it uses the HBase client API to insert data. We have to migrate to a bulk load approach: generate HFiles directly from snapshot and incremental images. We run a separate M/R job for every backup image between the last FULL backup and the incremental backup we are restoring to, and for every table in the list (image). With 10 tables and 30 days of incremental backup images, this results in 30 x 10 = 300 M/R jobs.
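
The job count in the description can be checked with a small sketch. It assumes one M/R job per (backup image, table) pair, as the description states. The function name and the table and image labels are hypothetical and are not taken from the HBase code base.

```python
from itertools import product


def plan_restore_jobs(tables, images):
    """Return one (image, table) pair per M/R job.

    Hypothetical helper: models the per-image, per-table scheme
    from the issue description, not actual HBase code.
    """
    return list(product(images, tables))


# Illustrative inputs: 10 tables and 30 daily incremental images.
tables = [f"table_{i}" for i in range(1, 11)]
images = [f"incr_day_{d}" for d in range(1, 31)]

jobs = plan_restore_jobs(tables, images)
print(len(jobs))  # 300
```

The count grows as the product of tables and images, so adding either one multiplies the number of jobs.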


