Posted to issues@hbase.apache.org by "Lars Hofhansl (JIRA)" <ji...@apache.org> on 2013/03/14 20:10:14 UTC

[jira] [Commented] (HBASE-8109) HBase can manage blocks instead of files in HDFS

    [ https://issues.apache.org/jira/browse/HBASE-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602592#comment-13602592 ] 

Lars Hofhansl commented on HBASE-8109:
--------------------------------------

Dude, you're talking heresy here :)

This sounds like a potentially good idea.
Won't you end up writing something similar to the name node again?
Also, what happens then with M/R on top of what are now HFiles?

Lastly, why not go further then? Why use HDFS at all? We could replicate at the RS level (which would also allow us to distribute reads across multiple servers).

                
> HBase can manage blocks instead of files in HDFS
> ------------------------------------------------
>
>                 Key: HBASE-8109
>                 URL: https://issues.apache.org/jira/browse/HBASE-8109
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Sergey Shelukhin
>
> Prompted by previous non-Hadoop experience and some dev list discussions, and after talking to some HDFS people about blocks.
> HBase could improve a lot by managing HDFS blocks instead of files, and reusing the blocks among other things. Some areas that could improve are splits, compactions, management of large blobs, locality enforcement.
> I was told that block APIs in Hadoop 2 are well-isolated, but not exposed yet. They can easily be exposed, and as one of the first potential users we could get to help shape them. Two areas that, from my limited understanding, are currently fuzzy are namespaces for blocks and ref-counting.
> We should come up with a list of initial scenarios to figure out what we need from the block API (locality, detecting/enforcing block boundaries, variable-size blocks, reusing one block, ...).
