Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2009/02/06 09:51:59 UTC

[jira] Commented: (HBASE-61) [hbase] Create an HBase-specific MapFile implementation

    [ https://issues.apache.org/jira/browse/HBASE-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671060#action_12671060 ] 

stack commented on HBASE-61:
----------------------------

Ryan checked in his rfile over here on github: http://github.com/ryanobjc/hbase-rfile/tree/master

It's up on github so more than one person can bang on it.  The notion is to first test rfile vs tfile vs mapfile (I checked the latest hfile into github for contrast) and then, whichever wins, make a patch for this issue out of the github code.

I added to github a way to evaluate RFile using PE.  It looks like RFile is ahead of MF using an 8k buffer and 10-byte cells.  Tomorrow I will do more work ensuring all the file formats return what they are supposed to, and will try to compare on dfs.

Talked to AJ also today.  He suggested playing with pread -- DFSDataIS has one -- so the file can be more 'live'.  He also suggested removing buffering on DFSDIS since we're reading in blocks, and that we look at the socket receive buffer size -- maybe add our own socket factory and, if block size < socket receive buffer size, use the smaller.
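
To make the two suggestions concrete, here is a rough sketch, not the rfile/hfile code.  It uses the JDK's FileChannel positional read as a local-file stand-in for pread on DFSDataIS, and a small helper for the "use the smaller" buffer rule.  The class name PreadSketch and the methods pread and chooseReceiveBufferSize are made up for illustration; how this would hook into DFS or a custom socket factory is left open.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PreadSketch {

    /**
     * Positional read: fills buf starting at the given file offset.
     * The channel's own position is not moved, so several readers can
     * share one open channel. Returns the number of bytes read.
     */
    public static int pread(FileChannel ch, long offset, ByteBuffer buf) throws IOException {
        int total = 0;
        while (buf.hasRemaining()) {
            int n = ch.read(buf, offset + total);
            if (n < 0) {
                break; // hit end of file before the buffer was full
            }
            total += n;
        }
        return total;
    }

    /**
     * Hypothetical helper for the buffer-size idea: if the block size is
     * below the socket receive buffer size, use the smaller of the two.
     */
    public static int chooseReceiveBufferSize(int blockSize, int socketReceiveBufferSize) {
        return Math.min(blockSize, socketReceiveBufferSize);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("pread-sketch", ".bin");
        try {
            Files.write(tmp, "0123456789abcdef".getBytes(StandardCharsets.US_ASCII));
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                ByteBuffer block = ByteBuffer.allocate(4);
                int n = pread(ch, 10, block);
                System.out.println("read " + n + " bytes: "
                        + new String(block.array(), 0, n, StandardCharsets.US_ASCII));
                System.out.println("channel position: " + ch.position());
            }
            // Assumed sizes: an 8k block against a 64k socket receive buffer.
            System.out.println("receive buffer: " + chooseReceiveBufferSize(8 * 1024, 64 * 1024));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Since the read takes an explicit offset and leaves the stream position alone, one open file can serve many random reads without a seek per read.  That is presumably the appeal of pread here.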

> [hbase] Create an HBase-specific MapFile implementation
> -------------------------------------------------------
>
>                 Key: HBASE-61
>                 URL: https://issues.apache.org/jira/browse/HBASE-61
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: io
>            Reporter: Bryan Duxbury
>            Assignee: stack
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: cpucalltreetfile.html, hfile.patch, hfile2.patch, hfile3.patch, longestkey.patch, tfile.patch, tfile3.patch
>
>
> Today, HBase uses the Hadoop MapFile class to store data persistently to disk. This is convenient, as it's already done (and maintained by other people :). However, it's beginning to look like there might be possible performance benefits to be had from doing an HBase-specific implementation of MapFile that incorporated some precise features.
> This issue should serve as a place to track discussion about what features might be included in such an implementation.
