Posted to issues@hbase.apache.org by "Matteo Bertozzi (JIRA)" <ji...@apache.org> on 2012/06/18 20:31:42 UTC

[jira] [Created] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Matteo Bertozzi created HBASE-6233:
--------------------------------------

             Summary: [brainstorm] snapshots: hardlink alternatives
                 Key: HBASE-6233
                 URL: https://issues.apache.org/jira/browse/HBASE-6233
             Project: HBase
          Issue Type: Brainstorming
            Reporter: Matteo Bertozzi
            Assignee: Matteo Bertozzi


Discussion ticket around snapshots and hardlink alternatives.
(See the HDFS-3370 discussion about hardlink and implementation problems)

(taking for a moment WAL out of the discussion and focusing on hfiles)
With hardlinks available, taking a snapshot would be fairly easy:
* (hfiles are immutable)
* hardlink to .snapshot/name to take snapshot
* hardlink from .snapshot/name to restore the snapshot
* No code change needed (on fs.delete() only one reference is deleted)
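
As a rough local-filesystem illustration of the bullets above (this is not HDFS, which is exactly the system lacking hardlinks; the directory and file names and the helper functions are made up for the example), taking and restoring a snapshot with hardlinks might look like this:

```python
import os
import tempfile

def take_snapshot(table_dir, snapshot_dir):
    """Hardlink every file in table_dir into snapshot_dir (no data is copied)."""
    os.makedirs(snapshot_dir, exist_ok=True)
    for name in os.listdir(table_dir):
        os.link(os.path.join(table_dir, name), os.path.join(snapshot_dir, name))

def restore_snapshot(snapshot_dir, table_dir):
    """Hardlink snapshot files back into table_dir, skipping files still present."""
    os.makedirs(table_dir, exist_ok=True)
    for name in os.listdir(snapshot_dir):
        target = os.path.join(table_dir, name)
        if not os.path.exists(target):
            os.link(os.path.join(snapshot_dir, name), target)

root = tempfile.mkdtemp()
table = os.path.join(root, "table")
snap = os.path.join(root, ".snapshot", "name")
os.makedirs(table)
with open(os.path.join(table, "hfile1"), "w") as f:
    f.write("immutable data")

take_snapshot(table, snap)
os.remove(os.path.join(table, "hfile1"))   # e.g. compaction deletes the hfile
restore_snapshot(snap, table)              # data is still reachable via the snapshot
with open(os.path.join(table, "hfile1")) as f:
    restored = f.read()
```

The delete removes only one name for the data, which is why no change to the deleting code would be needed.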

but we don't have hardlinks, what are the alternatives?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409257#comment-13409257 ] 

stack commented on HBASE-6233:
------------------------------

It'd be a radical change Matteo.  It feels like a hbase 2.0 kinda thing rather than a 0.96-type change (But this is a 'brainstorm' issue so we have license to talk hypotheticals).

I think we could auto-migrate from the old format to the new; new hfiles would be written into a new location in hdfs while we'd read the old unmigrated hfiles from the old layout ("Policy" for compatibility up to this point is that versions go forward, perhaps w/ a "migration step" but preferably not, and we do not have to support reverting an upgrade... that's "policy" so far).

Would we need x-row transactions updating files in .META.?  I don't think so.  Read/write locks might be enough.

We might need to let .META. split now that it can grow largish fast.

We've had "interesting" issues updating .META. in the past: e.g. a socket timeout on the client side even though the edit went through anyway... that kinda thing.  Now the repercussions of a failed update, or of a falsely reported failure, would be larger?

Yeah, instead of looking inside hdfs, hbck will have to read .META.  In hdfs, we'd still have tables and regions, or not?




                

        

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408711#comment-13408711 ] 

Zhihong Ted Yu commented on HBASE-6233:
---------------------------------------

Nice writeup.
Although we don't know when HDFS-3370 would be implemented, hdfs snapshot v1 would be delivered later this year.
Do we want to incur extra complexity in our codebase for the hadoop versions where there is no hdfs snapshot ?
                

        

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408762#comment-13408762 ] 

stack commented on HBASE-6233:
------------------------------

@Matteo Thanks for taking the time to do the writeup.  Helpful.  I like how your symlink approach would mean no extra work when we later move up onto hdfs hard links.

I was wondering if you have any concern around creation of all the symlinks on a table of some decent size taking a good bit of time Matteo?  The window during which the snapshot is being made could be pretty wide.  Would that be a problem?

You ask for ideas and the only one I have is the hackneyed one copied from bdbje where on compaction, we do not delete files; rather we just rename them w/ a '.del' ending and leave them in place.  On snapshot, we make a manifest of all files in the table.  On restore, we'd read the manifest and look for files first w/o the .del and then if not found, with the .del.  I've not thought it all through to the extent of your attached pdf -- I can see how it could get tangled pretty quickly -- but throwing it up there since you were asking.
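
A minimal sketch of the '.del' plus manifest idea just described, run against a local temp directory rather than HDFS. The helper names (retire, take_manifest, resolve) and the file names are hypothetical, chosen only for the illustration:

```python
import os
import tempfile

DEL_SUFFIX = ".del"  # marker for files retired by compaction

def retire(path):
    """Instead of deleting, rename the file with a '.del' ending and leave it in place."""
    os.rename(path, path + DEL_SUFFIX)

def take_manifest(table_dir):
    """Snapshot = sorted list of live file names (already-retired files are skipped)."""
    return sorted(n for n in os.listdir(table_dir) if not n.endswith(DEL_SUFFIX))

def resolve(table_dir, name):
    """On restore, look for the file without '.del' first, then with it."""
    for candidate in (name, name + DEL_SUFFIX):
        path = os.path.join(table_dir, candidate)
        if os.path.exists(path):
            return path
    raise FileNotFoundError(name)

table = tempfile.mkdtemp()
for name in ("hfile1", "hfile2"):
    open(os.path.join(table, name), "w").close()

manifest = take_manifest(table)          # taken before compaction
retire(os.path.join(table, "hfile1"))    # compaction "deletes" hfile1
resolved = [os.path.basename(resolve(table, n)) for n in manifest]
```

Note that normal readers would also have to skip the '.del' files, which is the extra logic this approach pushes into the existing code.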

bq. ...and can be an external tool or an internal thread that scan the snapshot.

Could hitch a ride on the current meta scanner, the one that cleans the parent regions from .META.

Adding list of files to .META. might make for our being able to do other fancyness such as the Accumulo fast table copy, etc.

Let me read your doc. some more (and Jesse's work).


                

        

[jira] [Updated] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "Matteo Bertozzi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matteo Bertozzi updated HBASE-6233:
-----------------------------------

    Attachment: Restore-Snapshot-Hardlink-alternatives.pdf

I've attached a document that tries to describe the hardlink alternatives (Reference Files, .META. Ref-count, Move & SymLink) in relation to the restore and mount operations.
                

        

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408718#comment-13408718 ] 

Zhihong Ted Yu commented on HBASE-6233:
---------------------------------------

bq. Are you talking about hdfs snapshot or hdfs hardlink?
hdfs snapshot. I have a sense that hdfs hardlink wouldn't make it into open source.
                

        

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401204#comment-13401204 ] 

Zhihong Ted Yu commented on HBASE-6233:
---------------------------------------

From discussion of HDFS-3370, it is unknown when hdfs hardlink would get accepted.
                

        

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401118#comment-13401118 ] 

Jonathan Hsieh commented on HBASE-6233:
---------------------------------------

[~mbertozzi]  Are you suggesting that .snapshot/files is a separate directory from the actual snapshot dirs, such that the .snapshot/files dir holds all the files, and both the real table and the snapshot tables use symlinks?

Would it be ok for a reader of a file from a snapshot mount to take the exception and then retry by reopening at the other location?
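
The retry-by-reopening idea could be sketched roughly as follows, on a local filesystem and with a hypothetical helper open_with_fallback; the two paths stand in for the table location and the .snapshot/files location:

```python
import os
import tempfile

def open_with_fallback(primary, fallback):
    """Try the table location first; if the file is gone, retry at the other location."""
    try:
        return open(primary, "rb")
    except FileNotFoundError:
        return open(fallback, "rb")

root = tempfile.mkdtemp()
table_path = os.path.join(root, "table", "hfile1")
archive_path = os.path.join(root, ".snapshot", "files", "hfile1")
os.makedirs(os.path.dirname(table_path))
os.makedirs(os.path.dirname(archive_path))
with open(archive_path, "wb") as f:      # the file has already been moved away
    f.write(b"data")

with open_with_fallback(table_path, archive_path) as f:
    content = f.read()
```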


                

        

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409066#comment-13409066 ] 

stack commented on HBASE-6233:
------------------------------

Looking at the doc. again:

Is there a table dir missing from this: "● /hbase/.snapshots/<snapshot name>/<region>/<cf>/<hfiles>"?

We have a filter in front of Filesystem now, HFileSystem.  We could instrument 'delete' to move the file rather than delete it if a snapshot has happened and we want to keep deleted files around.  I thought we could implement link here too, calling through if reflection determines it is present and doing whatever the alternative is when it's not there (would be some ugly casting to HFileSystem I suppose).

It's a 1000ft view, I know, but restoring a snapshot, won't we have to create the table directory structure to move the hardlinked hfiles back into place?

On keeping refcount in .META., Enis's suggestion over in HBASE-6205...

bq. When a file is deleted due to a compaction/region deletion we need to move that file somewhere and update all the references.

bq. Also having lots of file can slow down the .META. operations.

We have to move it?  We can't just decrement references?  We'd have to undo the association of files with particular regions -- the layout under ${HBASE.ROOTDIR} would not be as it is now.  We'd present a logical view that was detached from how the hfiles were stored in hdfs.

Other advantages of the refcount in .META. would be no need of moving files around or of keeping refs in hdfs... as many refs as snapshots.

I think the below will take a good amount of time on a loaded table of significant size (say ten region cluster with a table with 100 regions per node with say two column families with say three storefiles each):

{code}
○ Move the hfile to archive
○ Create a symlink to point to the archived file
○ Create a symlink for the snapshot
{code}

Even meta operations on namenode can take a good bit of time.
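
To put a rough number on that example (reading "ten region cluster" as ten nodes, which is an assumption, and counting one namenode operation per step in the list above):

```python
nodes = 10
regions_per_node = 100
families_per_region = 2
storefiles_per_family = 3
ops_per_hfile = 3  # move to archive, symlink to archive, symlink for snapshot

hfiles = nodes * regions_per_node * families_per_region * storefiles_per_family
namenode_ops = hfiles * ops_per_hfile
print(hfiles, namenode_ops)  # 6000 18000
```

So the snapshot would touch about 6,000 hfiles and issue about 18,000 metadata operations under these assumptions.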

Restores would be fast (You can symlink a directory? I've not used them).

Reference files does have the advantage you suggest that it'll be easy to move to hardlinks from symlinks (but again, I see the ops taking a long time, even if just meta ops -- is it ok that a snapshot takes a good amount of time... minutes?)

Your doc. is good Matteo.
I think it'll take 
                

       

[jira] [Comment Edited] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408718#comment-13408718 ] 

Zhihong Ted Yu edited comment on HBASE-6233 at 7/7/12 5:03 PM:
---------------------------------------------------------------

bq. Are you talking about hdfs snapshot or hdfs hardlink?
hdfs snapshot. I have a sense that hdfs hardlink wouldn't make it into open source.

One other aspect is the timing of releases for hdfs snapshot and HBase snapshot (0.96 presumably). If the two are close enough (or hdfs snapshot comes a little earlier), does it make sense to recommend customers upgrade both hdfs and HBase at the same time?
                

        

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "Matteo Bertozzi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408749#comment-13408749 ] 

Matteo Bertozzi commented on HBASE-6233:
----------------------------------------

{quote}So structural changes are needed for symlink approach to work. We should carefully evaluate the pros and cons of maintaining this new logic.{quote}
The cleanup is needed only to remove archived hfiles used by the snapshots,
and can be an external tool or an internal thread that scans the snapshot.
It is not only for the symlink approach but for all three, with the exception of the .META. refcount, which can run fs.delete() automatically when the refcount reaches zero.

(In the Jesse implementation HBASE-6055 there's already a cleanup tool implemented)
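
A minimal sketch of such a cleanup pass, on a local filesystem. This is not the HBASE-6055 tool; the directory layout, the function names, and the use of empty marker files as snapshot references are all assumptions made for the illustration:

```python
import os
import tempfile

def referenced_files(snapshots_dir):
    """Collect the hfile names that any snapshot still refers to."""
    names = set()
    for snapshot in os.listdir(snapshots_dir):
        names.update(os.listdir(os.path.join(snapshots_dir, snapshot)))
    return names

def cleanup_archive(archive_dir, snapshots_dir):
    """Delete archived hfiles that no snapshot references; return what was removed."""
    keep = referenced_files(snapshots_dir)
    removed = []
    for name in sorted(os.listdir(archive_dir)):
        if name not in keep:
            os.remove(os.path.join(archive_dir, name))
            removed.append(name)
    return removed

root = tempfile.mkdtemp()
archive = os.path.join(root, "archive")
snapshots = os.path.join(root, "snapshots")
os.makedirs(archive)
os.makedirs(os.path.join(snapshots, "snap1"))
for name in ("hfile1", "hfile2"):
    open(os.path.join(archive, name), "w").close()
# snap1 references only hfile1 (an empty marker file stands in for the reference)
open(os.path.join(snapshots, "snap1", "hfile1"), "w").close()

removed = cleanup_archive(archive, snapshots)
```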
                

        

[jira] [Updated] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "Matteo Bertozzi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matteo Bertozzi updated HBASE-6233:
-----------------------------------

    Attachment: Restore-Snapshot-Hardlink-alternatives-v2.pdf

Updated the doc to cover the different hbase.root file-system layout idea.
Removed the extra symlink for snapshot in the "Move & Symlink" approach.
And added some notes about why we can't just rely on .META. refcount with the current layout.
                

        

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408771#comment-13408771 ] 

stack commented on HBASE-6233:
------------------------------

bq. The time is (fs.rename() * nfiles + fs.symlink() * nfiles), but is just a metadata operation on HDFS.

Can take a while I've found.  Something to be aware of.

bq. And by doing this you need to add some logic to the current code to don't read the .del files

Yes.  It'd be ugly especially compared to symlinking w/ refcounting.

bq. So we just need to come up with an alternative to hardlink....

Smile.  Yes.
                

        

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "Matteo Bertozzi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409212#comment-13409212 ] 

Matteo Bertozzi commented on HBASE-6233:
----------------------------------------

{quote}
We have to move it? We can't just decrement references? We'd have to undo the association of files with particular regions – the layout under HBASE.ROOTDIR would not be as it is now. We'd present a logical view that was detached from how the hfiles were stored in hdfs.
{quote}
+1 on this. If we're going to change the HBASE.ROOTDIR layout, everything will be much easier, since we just need a "flat" folder that contains all the hfiles, and each table can keep track of its own files by scanning .META. In this way we can really use the ref-count and we don't have to move the files around.

But again, this requires code changes and changes in how the data is stored (What is the policy for compatibility?).
Also, while fixing .META. problems with hbck, it is useful to look inside the /hbase/<table> directory to see which files are present in a particular table.
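
The flat-folder plus ref-count idea could be sketched as below. This is an in-memory stand-in only: the RefCountedStore class, its methods, and the owner naming ("table:t1", "snapshot:s1") are hypothetical and do not reflect any actual .META. schema:

```python
class RefCountedStore:
    """In-memory stand-in for a flat hfile folder with per-file references kept in .META."""

    def __init__(self):
        self.refs = {}       # hfile name -> set of owners (tables or snapshots)
        self.deleted = []    # files whose count reached zero (stand-in for fs.delete())

    def add_ref(self, hfile, owner):
        self.refs.setdefault(hfile, set()).add(owner)

    def drop_ref(self, hfile, owner):
        owners = self.refs.get(hfile, set())
        owners.discard(owner)
        if hfile in self.refs and not owners:
            del self.refs[hfile]
            self.deleted.append(hfile)

    def files_of(self, owner):
        """What a table or snapshot 'contains', found by scanning the references."""
        return sorted(h for h, owners in self.refs.items() if owner in owners)

store = RefCountedStore()
store.add_ref("hfile1", "table:t1")
store.add_ref("hfile1", "snapshot:s1")   # taking a snapshot only adds a reference
store.drop_ref("hfile1", "table:t1")     # compaction drops the table's reference
still_there = store.files_of("snapshot:s1")
store.drop_ref("hfile1", "snapshot:s1")  # last reference gone, file can be deleted
```

Nothing is ever moved here, but the directory listing no longer tells you which table owns a file, which is the hbck concern raised above.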
                

       

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408746#comment-13408746 ] 

Zhihong Ted Yu commented on HBASE-6233:
---------------------------------------

bq. But we can have a cleanup “tool” as the other alternatives (Reference Files, .META refcount).
So structural changes are needed for symlink approach to work. We should carefully evaluate the pros and cons of maintaining this new logic.
                

       

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "Matteo Bertozzi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396133#comment-13396133 ] 

Matteo Bertozzi commented on HBASE-6233:
----------------------------------------

Taking a snapshot means keeping references to the hfiles currently present in the table.
Unfortunately, during compaction and split these files can be removed from their original location.

One solution can be:
 * move hfiles to .snapshot/files directory, during the "take snapshot" operation
 * create symlinks to .snapshot/files in the table folder.
 * create the snapshot reference in .snapshot/name/files

This makes it easy to restore a snapshot by creating a symlink.
The hbase code will not change, since compaction can still call fs.delete() and end up deleting just the symlink.
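
The three steps above can be sketched on a local POSIX filesystem as a stand-in for HDFS. This is only an illustration, not HBase code: the function names take_snapshot and restore_snapshot, the directory arguments, and the use of Python's os.rename()/os.symlink() in place of the HDFS calls are all assumptions made for the example. For simplicity the snapshot root is kept outside the table directory.

```python
import os
import tempfile


def take_snapshot(table_dir, snapshot_root, name):
    """Move each hfile into <snapshot_root>/files and leave symlinks behind."""
    files_dir = os.path.abspath(os.path.join(snapshot_root, "files"))
    snap_dir = os.path.abspath(os.path.join(snapshot_root, name, "files"))
    os.makedirs(files_dir, exist_ok=True)
    os.makedirs(snap_dir, exist_ok=True)
    for hfile in sorted(os.listdir(table_dir)):
        src = os.path.abspath(os.path.join(table_dir, hfile))
        if os.path.islink(src):
            # Already moved by an earlier snapshot: reuse the shared copy.
            target = os.path.realpath(src)
        else:
            target = os.path.join(files_dir, hfile)
            os.rename(src, target)   # step 1: move the real file
            os.symlink(target, src)  # step 2: the table now holds a symlink
        # step 3: the snapshot keeps its own reference to the shared file
        os.symlink(target, os.path.join(snap_dir, hfile))


def restore_snapshot(table_dir, snapshot_root, name):
    """Recreate missing table entries as symlinks to the shared files."""
    snap_dir = os.path.abspath(os.path.join(snapshot_root, name, "files"))
    for hfile in sorted(os.listdir(snap_dir)):
        link = os.path.join(table_dir, hfile)
        if not os.path.lexists(link):
            os.symlink(os.path.realpath(os.path.join(snap_dir, hfile)), link)
```

In this sketch a later delete of the table entry removes only the symlink, so the data under the snapshot's files directory survives and restore is just another symlink. The restore here only re-adds missing files; it does not remove files created after the snapshot.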

One problem is that between the fs.rename() and the fs.createSymlink() the filename is not available, and if DFSInputStream.callGetBlockLocations() is called in that window you end up with a FileNotFoundException.
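
One way a reader could tolerate that window is the "take the exception and retry" option mentioned elsewhere in this thread. A minimal sketch, again on a local filesystem; open_with_retry and its attempts/delay parameters are hypothetical names for the example, not an HBase or HDFS API:

```python
import time


def open_with_retry(path, attempts=3, delay=0.05):
    """Open a file, retrying briefly if the name is missing mid-rename."""
    for attempt in range(attempts):
        try:
            return open(path, "rb")
        except FileNotFoundError:
            if attempt == attempts - 1:
                raise  # still missing after the last attempt: give up
            time.sleep(delay)
```

The trade-off is that every reader has to carry this logic, which is exactly the kind of code change the symlink approach is trying to avoid.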

_But this means that after a snapshot the files in the table folder are symlinks, so you'll see both normal files and symlinks during the table's life._
                

        

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "Matteo Bertozzi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408716#comment-13408716 ] 

Matteo Bertozzi commented on HBASE-6233:
----------------------------------------

{quote}Do we want to incur extra complexity in our codebase for the hadoop versions where there is no hdfs snapshot?{quote}
Are you talking about hdfs snapshot or hdfs hardlink?

I don't think that hbase can rely on hdfs snapshots (e.g. memstore and region info need to be handled in a special way).

As for the missing hdfs hardlink support, I think that what I'm proposing simplifies the snapshot a lot, since we don't need to change the current code to handle hfile deletions.

But I want some feedback on this. Does anyone have other suggestions/ideas?
                

        

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "Matteo Bertozzi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408763#comment-13408763 ] 

Matteo Bertozzi commented on HBASE-6233:
----------------------------------------

{quote}
                

        

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "Matteo Bertozzi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401198#comment-13401198 ] 

Matteo Bertozzi commented on HBASE-6233:
----------------------------------------

@Jon
Yes, on "Take snapshot" you rename the hfile into the .snapshot/files directory and replace it with a symlink.
You also need to create a symlink in the .snapshot/name/ folder (the one that describes the snapshot).
When you want to restore, you just have to create a symlink to the file.

I see two advantages in using this approach:
One is that the code remains unchanged: fs.delete() stays fs.delete() (all the "symlink" code is in takeSnapshot() and nothing changes from the hbase point of view).

The other one is:
 * hbase 0.96 ships with snapshots (hardlink alternative)
 * hbase 0.98 ships with snapshots + hdfs hardlink
If you use the approach that I've described, a user who has taken snapshots using 0.96 doesn't have to do anything special to migrate to 0.98. The symlinks to .snapshot/files/ keep working, and a future 'take snapshot' just creates a hardlink in .snapshot/name/ and restores with another hardlink against .snapshot/name.

In the other case (take the exception and retry) you need to keep the logic in 0.98, or write some fancy script that searches for the "Reference" files and replaces them with hardlinks.
                

        

[jira] [Commented] (HBASE-6233) [brainstorm] snapshots: hardlink alternatives

Posted by "Matteo Bertozzi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408764#comment-13408764 ] 

Matteo Bertozzi commented on HBASE-6233:
----------------------------------------

{quote}I was wondering if you have any concern around creation of all the symlinks on a table of some decent size taking a good bit of time Matteo? The window during which the snapshot is being made could be pretty wide. Would that be a problem?
{quote}
The time is (fs.rename() * nfiles + fs.symlink() * nfiles), but these are just metadata operations on HDFS. I don't have numbers for how long it takes, but I can come up with some benchmarks, maybe with hdfs under heavy load.
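
As a rough idea of what such a benchmark could look like, here is a sketch that times nfiles rename + symlink pairs. It runs against a local temporary directory, so the numbers say nothing about a loaded HDFS namenode; time_rename_and_symlink is a made-up name for the example.

```python
import os
import shutil
import tempfile
import time


def time_rename_and_symlink(nfiles):
    """Return seconds spent on nfiles rename + symlink pairs (local FS only)."""
    root = tempfile.mkdtemp()
    try:
        table = os.path.join(root, "table")
        files = os.path.join(root, "files")
        os.makedirs(table)
        os.makedirs(files)
        for i in range(nfiles):
            # Empty placeholder files: only metadata operations are timed.
            open(os.path.join(table, "hfile%d" % i), "w").close()
        start = time.perf_counter()
        for i in range(nfiles):
            src = os.path.join(table, "hfile%d" % i)
            dst = os.path.join(files, "hfile%d" % i)
            os.rename(src, dst)
            os.symlink(dst, src)
        return time.perf_counter() - start
    finally:
        shutil.rmtree(root)
```

The elapsed time should grow roughly linearly with nfiles, which matches the formula above; the per-operation cost on HDFS would have to be measured there.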

Anyway, you need to keep track of the files in some way: create one reference file for each file, or add a reference in .META., and both seem much heavier since they require interaction with both namenode and datanode.

{quote}we do not delete files; rather we just rename them w/ a '.del' ending and leave them in place.{quote}
But if you want to remove the table, these files have to be moved.
And by doing this you need to add some logic to the current code so that it doesn't read the .del files.
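
To make the extra logic concrete, here is a small sketch of the '.del' idea on a local filesystem. The names mark_deleted and live_store_files are hypothetical, and the real change would have to be made wherever hbase lists store files:

```python
import os


def mark_deleted(path):
    """Instead of deleting, rename the file with a '.del' ending."""
    os.rename(path, path + ".del")


def live_store_files(store_dir):
    """List store files, skipping the ones marked with a '.del' ending."""
    return sorted(f for f in os.listdir(store_dir) if not f.endswith(".del"))
```

Every code path that lists or opens store files would need a filter like live_store_files, whereas with symlinks fs.delete() keeps its current meaning.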

{quote}
Adding list of files to .META. might make for our being able to do other fancyness such as the Accumulo fast table copy, etc.
{quote}
The accumulo clone table is one of the features that we can easily get with snapshots.
I've called it "mount snapshot", which is essentially the accumulo clone table. (Take a look at HBASE-6353 for a description of the snapshot operations.)

Again, if you think of restore with hardlink support, you can easily have everything. So we just need to come up with an alternative to hardlinks.
                