Posted to issues@hbase.apache.org by "Karthik Ranganathan (Created) (JIRA)" <ji...@apache.org> on 2011/10/25 06:52:32 UTC

[jira] [Created] (HBASE-4662) Replay the required hlog edits to make the backup preserve row atomicity.

Replay the required hlog edits to make the backup preserve row atomicity.
-------------------------------------------------------------------------

Key: HBASE-4662
URL: https://issues.apache.org/jira/browse/HBASE-4662
Project: HBase
Issue Type: Sub-task
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan

The algorithm is as follows:

A. For HFiles:
1. Track t1 and t2 (the start and end times) for each backup
2. For a point-in-time restore to time t, pick an HFile snapshot that has t2 < t
3. Copy the HFile snapshot to a temp location, HTABLE_RESTORE_t
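
As a rough illustration of step A.2, the snapshot choice could look like the sketch below. The Backup record and the pick_snapshot and restore_dir names are hypothetical and not part of HBase; times are plain integers (for example epoch milliseconds). Preferring the latest eligible snapshot is an assumption; the description only requires t2 < t.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Backup:
    """Hypothetical record of one HFile backup run (not an HBase class)."""
    path: str
    t1: int  # backup start time
    t2: int  # backup end time


def pick_snapshot(backups: Iterable[Backup], t: int) -> Optional[Backup]:
    """Return the most recent backup that finished strictly before t."""
    candidates = [b for b in backups if b.t2 < t]
    # Assumption: the latest eligible snapshot leaves the fewest HLog edits
    # to replay. The description itself only requires t2 < t.
    return max(candidates, key=lambda b: b.t2, default=None)


def restore_dir(t: int) -> str:
    """Temp location named after the restore time, as in step A.3."""
    return f"HTABLE_RESTORE_{t}"
```

If no backup finished before t, pick_snapshot returns None and the restore cannot proceed from a snapshot.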

B. For HLogs:
for each regionserver do
  for .logs and .oldlogs do
    1. log file is hlog.TIME
    2. if (t > TIME and hlog.TIME is open for write) fail restore for t
    3. Pick the latest HLog whose create time is < t1
    4. Pick all HLogs whose create time is > t1 and <= t2
    5. Copy hlogs to the right structures inside HTABLE_RESTORE_t
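
Steps B.1 to B.4 can be sketched for a single regionserver as follows. HLogFile, RestoreError and select_hlogs are hypothetical names used only for illustration, and the sketch assumes the file name really is of the form hlog.TIME with an integer TIME.

```python
from typing import List, NamedTuple


class HLogFile(NamedTuple):
    """Hypothetical descriptor for one log file under .logs or .oldlogs."""
    name: str             # assumed form: "hlog.TIME"
    open_for_write: bool


class RestoreError(Exception):
    """Raised when a restore to time t cannot be satisfied."""


def create_time(log: HLogFile) -> int:
    # Step B.1: the create time is the suffix after the last dot.
    return int(log.name.rsplit(".", 1)[1])


def select_hlogs(logs: List[HLogFile], t: int, t1: int, t2: int) -> List[HLogFile]:
    """Apply steps B.2 to B.4 to the logs of one regionserver."""
    for log in logs:
        # Step B.2: fail if a log created before t is still open for write.
        if t > create_time(log) and log.open_for_write:
            raise RestoreError(f"{log.name} is still open; cannot restore to {t}")

    selected: List[HLogFile] = []
    # Step B.3: the latest log created before the backup started.
    before = [log for log in logs if create_time(log) < t1]
    if before:
        selected.append(max(before, key=create_time))
    # Step B.4: every log created after t1 and up to and including t2.
    during = [log for log in logs if t1 < create_time(log) <= t2]
    selected.extend(sorted(during, key=create_time))
    return selected
```

The returned list is ordered by create time, which is the order in which the files would be copied in step B.5.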

C. Split logs
1. Enhance HLog.splitLog to take timestamp t
2. Enhance distributed log split tool to pass HTABLE_RESTORE_t, so that log split is picked up and put in the right location
3. Enhance distributed log split tool to pass t so that all edits till t are included and others ignored
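
The effect of C.2 and C.3 on the split output can be pictured with the sketch below. It is not the HLog.splitLog code: Edit is a simplified stand-in for a log entry, the output path layout is invented for illustration, and treating "till t" as inclusive (timestamp <= t) is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Edit(NamedTuple):
    """Hypothetical, simplified stand-in for one HLog entry."""
    region: str
    row: bytes
    timestamp: int
    value: bytes


def split_edits(edits: Iterable[Edit], t: int, restore_root: str) -> Dict[str, List[Edit]]:
    """Group edits per region under restore_root, dropping edits newer than t."""
    out: Dict[str, List[Edit]] = defaultdict(list)
    for edit in edits:
        if edit.timestamp > t:
            continue  # C.3: edits after the restore time are ignored
        # C.2: split output goes under the restore directory, not the live one.
        out[f"{restore_root}/{edit.region}"].append(edit)
    return dict(out)
```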

D. Import the directory into the running HBase with META entries, etc (this already exists)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Issue Comment Edited] (HBASE-4662) Replay the required hlog edits to make the backup preserve row atomicity.

Posted by "Zhihong Yu (Issue Comment Edited) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217709#comment-13217709 ] 

Zhihong Yu edited comment on HBASE-4662 at 2/28/12 12:25 AM:
-------------------------------------------------------------

The task framework ('cassini') is interesting and may be useful for implementing region deletion; see HBASE-4991.

How does cassini handle failure scenarios?
                
> Replay the required hlog edits to make the backup preserve row atomicity.
> -------------------------------------------------------------------------
>
>                 Key: HBASE-4662
>                 URL: https://issues.apache.org/jira/browse/HBASE-4662
>             Project: HBase
>          Issue Type: Sub-task
>          Components: documentation, regionserver
>            Reporter: Karthik Ranganathan
>            Assignee: Karthik Ranganathan

[jira] [Commented] (HBASE-4662) Replay the required hlog edits to make the backup preserve row atomicity.

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166670#comment-13166670 ] 

Lars Hofhansl commented on HBASE-4662:
--------------------------------------

Thanks for writing this up!

I had a few questions:

* How do you currently back up your HLogs? Do you have a process that watches .[old]logs and copies/archives every new file appearing there?
* How do you back up the HFiles? Do you issue a flush before you do this?
* The tool you mention in D is not completebulkload, right? Will that tool deal with replaying the logs you placed in B.5?
* I found that distributed log splitting relies on region names in the HLog in order to do the splitting. If any region splits happened after the HLog was written, or this is a new table, the replay will fail for regions that no longer exist. Do you plan to change the distributed log splitter to deal with this? (It would need to map the rowkeys back to the now-current set of regions.)
* HLogs contain entries for many tables. In the approach above, whatever replays the log would need to replay only those entries pertaining to the HFiles copied over, right?

Thanks again...
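
To make the rowkey question concrete: mapping a rowkey back to the now-current set of regions amounts to a sorted lookup over region start keys. A minimal sketch, assuming the current start keys have already been read into a sorted list; region_for_row is a hypothetical helper, not an HBase API.

```python
import bisect
from typing import List


def region_for_row(start_keys: List[bytes], row: bytes) -> int:
    """Return the index of the region whose [start, next_start) range holds row.

    start_keys must be sorted and begin with b"", the empty start key of the
    first region. Python compares bytes lexicographically by unsigned byte
    value, which is the ordering assumed for rowkeys here.
    """
    # bisect_right counts the start keys that are <= row; the last of those
    # belongs to the region that contains the row.
    return bisect.bisect_right(start_keys, row) - 1
```

For example, with start keys [b"", b"g", b"p"], a row b"g" falls in the second region (index 1) because the start key is inclusive.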
                


[jira] [Commented] (HBASE-4662) Replay the required hlog edits to make the backup preserve row atomicity.

Posted by "Karthik Ranganathan (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217926#comment-13217926 ] 

Karthik Ranganathan commented on HBASE-4662:
--------------------------------------------

Lots of discussion on HBASE-4991... could someone update the description of that JIRA to reflect the intent more accurately? (I didn't read through all the discussion, but the following questions come to mind: Is it just "delete regions in HBase"? Is that similar to merge, and if so, why not use merge? So I am not sure how this can be used there.)

<< How does cassini handle failure scenarios ? >>
The various processes run a leader election among themselves; one of them becomes the master. The master moves tasks that are assigned to failed processes back to the unassigned state after a timeout.
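
The timeout-based reassignment can be pictured roughly as below. This is not cassini code: Task, reassign_failed and the heartbeat map are hypothetical, and detecting failure through last-heartbeat times is an assumption made only for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Task:
    """Hypothetical task record; owner is None while the task is unassigned."""
    name: str
    owner: Optional[str] = None


def reassign_failed(tasks: List[Task], last_heartbeat: Dict[str, float],
                    now: float, timeout: float) -> List[Task]:
    """Return tasks moved back to unassigned because their owner timed out."""
    released = []
    for task in tasks:
        if task.owner is None:
            continue  # already unassigned
        seen = last_heartbeat.get(task.owner)
        # An owner that never reported, or has been silent for longer than
        # the timeout, is treated as failed.
        if seen is None or now - seen > timeout:
            task.owner = None
            released.append(task)
    return released
```

In this picture the elected master would call reassign_failed periodically, and idle workers would then pick up the released tasks.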
                

[jira] [Commented] (HBASE-4662) Replay the required hlog edits to make the backup preserve row atomicity.

Posted by "Karthik Ranganathan (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217682#comment-13217682 ] 

Karthik Ranganathan commented on HBASE-4662:
--------------------------------------------

Missed this one:


<< How do you currently back up your HLogs? Do you have a process that watches .[old]logs and copies/archives every new file appearing there? >>
We have written a task framework (code name 'cassini'). Logically, the framework is the equivalent of a distributed thread pool. It manages N worker threads (one per regionserver) across M machines (destination backup machines, for example), using ZK as the persistent store for the queue of tasks. It can run plugins, coded to certain requirements, that do arbitrary work. We have implemented a plugin for that framework to tail and play logs. Will put that one out soon.

<< How do you back up the HFiles? Do you issue a flush before you do this?
The tool you mention in D is not completebulkload, right? Will that tool deal with replaying the logs you placed in B.5? >>
The above two are in the diff. Yes, we issue a flush, and there is a custom tool. HLog replay is not done yet; we have an initial diff that we have not yet productized.

<< I found that distributed log splitting relies on region names in the HLog in order to do the splitting. If any region splits happened after the HLog was written, or this is a new table, the replay will fail for regions that no longer exist. Do you plan to change the distributed log splitter to deal with this? (It would need to map the rowkeys back to the now-current set of regions.) >>

<< HLogs have entries of many tables. In the approach above whatever replays the log would need to only replay those entries pertaining to the HFiles copied over, right? >>
Yes, and potentially take care of changed table names (export from table A, import as table B).
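
A sketch of that filtering, including the rename case (export from table A, import as table B). TableEdit and edits_for_table are hypothetical names; the real log entry format carries more than is shown here.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class TableEdit(NamedTuple):
    """Hypothetical, simplified log entry tagged with its table name."""
    table: str
    row: bytes
    timestamp: int


def edits_for_table(edits: Iterable[TableEdit], source_table: str,
                    target_table: Optional[str] = None) -> Iterator[TableEdit]:
    """Yield only edits of source_table, renamed to target_table if given."""
    target = target_table or source_table
    for edit in edits:
        if edit.table == source_table:
            # Entries for other tables in the same HLog are skipped.
            yield edit._replace(table=target)
```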
                