Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2010/06/05 23:12:05 UTC

[jira] Commented: (HBASE-50) Snapshot of table

    [ https://issues.apache.org/jira/browse/HBASE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875941#action_12875941 ] 

stack commented on HBASE-50:
----------------------------

Here are some comments on the design requirements, Li:

+ Add a date and a link to this issue to give your design context.
+ FYI, there has been talk of adding snapshots to hdfs.  It's mentioned here: http://hadoop.apache.org/common/docs/current/hdfs_design.html#Snapshots.  The issue is stalled at the moment: HDFS-233.
+ I don't think you should take on requirement 1), that only the hbase admin can create a snapshot.  There is no authentication/access control in hbase currently -- it's coming but not here yet -- and without it, this would be hard for you to enforce.
+ Regarding requirement 2), I'd suggest that how the snapshot gets copied out from under hbase should also be outside the scope of your work.  I'd say your work is making a viable snapshot that can be copied, with perhaps some tests to prove it works -- tests that might copy off data -- but in general, I'd say how the actual copying is done is outside the scope of this issue.
+ Requirement 6), resuming from a snapshot: yes, this is in scope.  How the stuff is copied into place, I'd argue, is not.  Of course, if you have the time to work on copy-out and copy-in functionality, great, but I'd peg this as lower priority.
+ Otherwise, the requirements are great.



> Snapshot of table
> -----------------
>
>                 Key: HBASE-50
>                 URL: https://issues.apache.org/jira/browse/HBASE-50
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Billy Pearson
>            Assignee: Li Chongxin
>            Priority: Minor
>         Attachments: HBase Snapshot Design Report V2.pdf, snapshot-src.zip
>
>
> Having an option to take a snapshot of a table would be very useful in production.
> What I would like to see this option do is merge all the data into one or more files stored in the same folder on the dfs. This way we could save data in case of a software bug in hadoop or user code.
> The other advantage would be the ability to export a table to multiple locations. Say I had a read_only table that must be online. I could take a snapshot of it when needed and export it to a separate data center and have it loaded there, and then I would have it online at multiple data centers for load balancing and failover.
> I understand that hadoop removes the need for backups to protect against failed servers, but this does not protect us from software bugs that might delete or alter data in ways we did not plan. We should have a way to roll back a dataset.
