Posted to issues@hbase.apache.org by "Karthik Ranganathan (Created) (JIRA)" <ji...@apache.org> on 2011/10/24 23:28:33 UTC

[jira] [Created] (HBASE-4655) Document architecture of backups

Document architecture of backups
--------------------------------

                 Key: HBASE-4655
                 URL: https://issues.apache.org/jira/browse/HBASE-4655
             Project: HBase
          Issue Type: Sub-task
            Reporter: Karthik Ranganathan
            Assignee: Karthik Ranganathan


Basic idea behind the backup architecture for HBase

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4655) Document architecture of backups

Posted by "Karthik Ranganathan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286809#comment-13286809 ] 

Karthik Ranganathan commented on HBASE-4655:
--------------------------------------------

I think we should add this doc to the HBase book. The code for this HBase backups feature is already done. I think the next step is to implement a simple wrapper script, and document that as well.

The tasks are already created, see HBASE-4618 for a list of sub-tasks (tasks 1, 2, 4 and 6 are done, 4 needs to be checked in and closed out).

The next one to look at would be HBASE-4664. Let me add some comments in there about what we came up with internally, and then we can go ahead from there.
                
> Document architecture of backups
> --------------------------------
>
>                 Key: HBASE-4655
>                 URL: https://issues.apache.org/jira/browse/HBASE-4655
>             Project: HBase
>          Issue Type: Sub-task
>          Components: documentation, regionserver
>            Reporter: Karthik Ranganathan
>            Assignee: Karthik Ranganathan
>         Attachments: HBase Backups Architecture v2.docx, HBase Backups Architecture.docx
>
>
> Basic idea behind the backup architecture for HBase


        

[jira] [Commented] (HBASE-4655) Document architecture of backups

Posted by "Karthik Ranganathan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151595#comment-13151595 ] 

Karthik Ranganathan commented on HBASE-4655:
--------------------------------------------

<< For '...incremental backups at the Stage 1 (RBU) level', won't the time between steps b and d be 'large', and during the copy time the list of files could change on you; i.e., when you go to copy a file, it may have been removed because it'd been compacted. What do you do in this case? (Your list may not include the compacted file.) >>

We look for the deleted files in .Trash and reclaim them. If they are not present, we fail the backup for the region. The backup job runs in loops: the first loop starts out with all regions, the failed regions are output, and the second loop works only on the failed regions. The number of loops is configurable; we default to 5.
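
The loop structure described above can be sketched roughly as follows. This is only an illustration, not the actual backup job: `run_backup_loops` and the `backup_region` callback are hypothetical names, and the callback is assumed to return True when a region's files were all copied (or reclaimed from .Trash) and False otherwise.

```python
from typing import Callable, Iterable, List


def run_backup_loops(regions: Iterable[str],
                     backup_region: Callable[[str], bool],
                     max_loops: int = 5) -> List[str]:
    """Back up every region, retrying only the failures in each later loop.

    backup_region is a hypothetical callback: True means the region was
    backed up, False means the backup failed for that region.
    Returns the regions that still failed after max_loops passes.
    """
    pending = list(regions)
    for _ in range(max_loops):
        if not pending:
            break
        # Keep only the regions whose backup failed in this pass.
        pending = [r for r in pending if not backup_region(r)]
    return pending
```

Anything left in the returned list after the last loop would need operator attention or a later run.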


<< For "a. The backups rely on the clocks across the various region-servers for determining the point in time to which the edits are re-played", so, say a server is lagging the others by a good bit? When replaying the edits, you'd replay edits from when this lagging server said the backup began? >>

No, right now we just subtract a configurable amount of time (say 5 mins) from the start time of the MR job to keep things simple. We could totally do what you say as an enhancement.
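
As a small worked example of that arithmetic (the function and parameter names here are made up for illustration, and the 5-minute default just mirrors the "say 5 mins" above):

```python
from datetime import datetime, timedelta


def replay_cutoff(job_start: datetime,
                  skew_margin: timedelta = timedelta(minutes=5)) -> datetime:
    """Point in time used for replaying edits: the MR job start time minus
    a configurable safety margin for clock skew (hypothetical helper)."""
    return job_start - skew_margin
```

So a job started at 12:00:00 would use 11:55:00 with the default margin.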

<< How will you know which hlogs to replay? You'll open it and look at first and last edits in the file? Or should we write out metadata files for hlogs? Or is it enough relying on hdfs modtime? >>

The hlog files are named hlog.TIMESTAMP, where TIMESTAMP is the time when the log was created. We look at this time to determine the file set. We need all files where TIMESTAMP > start time and TIMESTAMP < finish time, plus the latest file where TIMESTAMP < start time.
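
A minimal sketch of that selection rule, assuming file names of the form hlog.TIMESTAMP with an integer timestamp; `select_hlogs` is a hypothetical helper written for this comment, not part of the backup code:

```python
from typing import List


def select_hlogs(names: List[str], start: int, finish: int) -> List[str]:
    """Pick the hlog files to replay for the window (start, finish).

    Assumes names look like 'hlog.<TIMESTAMP>' where TIMESTAMP is the
    integer creation time of the log.
    """
    stamped = sorted((int(n.split(".", 1)[1]), n) for n in names)
    # Logs created strictly inside the window (strict inequalities mirror
    # the rule as stated above).
    in_window = [n for ts, n in stamped if start < ts < finish]
    # The latest log created before the window opens is also needed, since
    # it keeps receiving edits until the next log is created.
    before = [n for ts, n in stamped if ts < start]
    return before[-1:] + in_window
```

For example, with logs stamped 100, 200, 300, 400 and 500 and a window of (250, 450), this picks hlog.200, hlog.300 and hlog.400.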

                

        

[jira] [Commented] (HBASE-4655) Document architecture of backups

Posted by "Doug Meil (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151452#comment-13151452 ] 

Doug Meil commented on HBASE-4655:
----------------------------------

I'll gladly port this to the book, and I'd like to add this in here...
http://hbase.apache.org/book.html#ops.backup
... with the existing backup info.
                

        

[jira] [Commented] (HBASE-4655) Document architecture of backups

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151589#comment-13151589 ] 

stack commented on HBASE-4655:
------------------------------

Echo Todd #1 remarks.

For '...incremental backups at the Stage 1 (RBU) level', won't the time between steps b and d be 'large', and during the copy time the list of files could change on you; i.e., when you go to copy a file, it may have been removed because it'd been compacted.  What do you do in this case?  (Your list may not include the compacted file.)

For "a. The backups rely on the clocks across the various region-servers for determining the point in time to which the edits are re-played", so, say a server is lagging the others by a good bit?   When replaying the edits, you'd replay edits from when this lagging server said the backup began?

How will you know which hlogs to replay?  You'll open it and look at first and last edits in the file?  Or should we write out metadata files for hlogs?  Or is it enough relying on hdfs modtime?

Looks great K.


                

        

[jira] [Commented] (HBASE-4655) Document architecture of backups

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151508#comment-13151508 ] 

Todd Lipcon commented on HBASE-4655:
------------------------------------

Two quick notes from looking over the doc:

- the names are a little confusing to me - "in-cluster back up" is actually two clusters, right? I'd call your "RBU" an in-cluster backup, I'd call your CBU an "in-datacenter backup", and I'd call your DBU a "cross-datacenter backup", "DR backup", or "BCP backup".

- For RBU, maybe we can get atomicity in a simpler manner by having the region server initiate the copy of hfiles? It can hold the lock to block flushes while the copies happen (they're hard-link copies, right?) 

                

        

[jira] [Commented] (HBASE-4655) Document architecture of backups

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151605#comment-13151605 ] 

stack commented on HBASE-4655:
------------------------------

Sounds good Karthik.
                

        

[jira] [Commented] (HBASE-4655) Document architecture of backups

Posted by "Karthik Ranganathan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151564#comment-13151564 ] 

Karthik Ranganathan commented on HBASE-4655:
--------------------------------------------

For #1, totally :) Internally, we use the term "cluster" to denote a section of the data center (as opposed to the HBase cluster); a data center is composed of a number of "clusters", hence the name. The names in-DC and cross-DC work.

For #2, this makes the running cluster stall and not take updates for the time period of the copy. It is fast-copy with hard-links underneath, but there is nothing in the current design that would stop it from being used against a remote cluster or a DFS version without the hard-link. Also, if for some reason the hard link fails, it does a deep copy, so it could have longer stalls.
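
To illustrate the link-then-fall-back behaviour, here is a local-filesystem analogue in Python. It is only a sketch of the idea: the real copy runs against the DFS fast-copy, and `fast_copy` below is a hypothetical helper that uses the standard `os.link` and `shutil.copy2` calls rather than anything from HBase or HDFS.

```python
import os
import shutil


def fast_copy(src: str, dst: str) -> str:
    """Hard-link src to dst if possible, otherwise fall back to a deep copy.

    Returns 'link' or 'copy' so the caller can tell which path was taken
    (the deep copy is the slow case that would lengthen any stall).
    """
    try:
        os.link(src, dst)  # cheap: no file data is copied
        return "link"
    except OSError:
        # Hard link not supported or failed (e.g. across filesystems).
        shutil.copy2(src, dst)
        return "copy"
```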
                

        

[jira] [Commented] (HBASE-4655) Document architecture of backups

Posted by "Karthik Ranganathan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151457#comment-13151457 ] 

Karthik Ranganathan commented on HBASE-4655:
--------------------------------------------

Sounds great Doug! Maybe we make a new section, keep adding stuff in, and deprecate the old stuff? Or whatever works...
                

        

[jira] [Updated] (HBASE-4655) Document architecture of backups

Posted by "Karthik Ranganathan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Ranganathan updated HBASE-4655:
---------------------------------------

    Attachment: HBase Backups Architecture v2.docx

Made modifications as suggested here, and also made certain explanations clearer. Also added a notes/FAQ section based on some questions I have received both here and via email.
                

        

[jira] [Commented] (HBASE-4655) Document architecture of backups

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285289#comment-13285289 ] 

stack commented on HBASE-4655:
------------------------------

What should we do w/ this doc Karthik?  Seems like there is still stuff to build out?  Should we make issues for what's to be done?
                

        

[jira] [Commented] (HBASE-4655) Document architecture of backups

Posted by "Karthik Ranganathan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165436#comment-13165436 ] 

Karthik Ranganathan commented on HBASE-4655:
--------------------------------------------

@Doug:

<< list all the regions, for each region, ask the RS hosting it for a list of HFiles >>

There is already an API to get a list of regions and the regionservers hosting them. And we added a new API to the RS to list the HFiles for the regions it hosts.

<< The strategy is great, but it will generate a flurry of (warranted) questions on how the average person does it. >>

True - but this task is only to make sure the document is easy for an average user to read and understand. We can definitely add more details if needed, but that would risk confusing people. I will definitely incorporate the other suggestions (confusing names, etc). The rest of the tasks deal with giving average users a way to do backups by running/cron-ing a command, without having to deal with the internals of how it works.
                

        

[jira] [Resolved] (HBASE-4655) Document architecture of backups

Posted by "Karthik Ranganathan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Ranganathan resolved HBASE-4655.
----------------------------------------

    Resolution: Fixed
    

        

[jira] [Commented] (HBASE-4655) Document architecture of backups

Posted by "Karthik Ranganathan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281684#comment-13281684 ] 

Karthik Ranganathan commented on HBASE-4655:
--------------------------------------------

Marking as resolved, feel free to send more comments my way in case something is not clear.
                

        

[jira] [Commented] (HBASE-4655) Document architecture of backups

Posted by "Doug Meil (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160286#comment-13160286 ] 

Doug Meil commented on HBASE-4655:
----------------------------------

Hi folks, sorry about the delay in commenting.

I liked the refresher on "why backup?" in the beginning.  

I also found some of the names confusing (e.g., RBU, CBU).

The strategy here in the doc is terrific, but I'd like to see this get a little more "actionable" with specifics.  For example in the Stage1 RBU incremental, "list all the regions, for each region, ask the RS hosting it for a list of HFiles".  How is this to be done?  Using Java-API to list regions?  Reading the HBase files from HDFS?  Ostensibly the RS hosting the region has to come from an online API.  The strategy is great, but it will generate a flurry of (warranted) questions on how the average person does it.


 
                

        

[jira] [Updated] (HBASE-4655) Document architecture of backups

Posted by "Karthik Ranganathan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Ranganathan updated HBASE-4655:
---------------------------------------

    Attachment: HBase Backups Architecture.docx

Basic HBase backup architecture and the various levels of protection it would offer.
                