Posted to issues@hbase.apache.org by "Jonathan Hsieh (Commented) (JIRA)" <ji...@apache.org> on 2011/10/24 19:13:32 UTC

[jira] [Commented] (HBASE-4650) Update LoadIncrementalHFiles to use atomic bulk load RS mechanism

    [ https://issues.apache.org/jira/browse/HBASE-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134253#comment-13134253 ] 

Jonathan Hsieh commented on HBASE-4650:
---------------------------------------

I'm in the process of cleaning up the modifications to LoadIncrementalHFiles and adding more tests before submitting HBASE-4650 for review.  This cut passes the two tests that use LoadIncrementalHFiles (TestLoadIncrementalHFiles and TestHFileOutputFormat).  I'll post a preliminary version for those interested.

The grouping introduces significant logic changes, so I've chosen to take out the concurrency in this first cut: gathering and splitting HFiles into proper groups introduces a synchronization point that rules out some of the concurrency the previous code had.  This is because a region's group needs to be fully gathered before a bulk load into that region is attempted.  I'll include comments where concurrency is ok.
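
To make the grouping step concrete, here is a minimal sketch of gathering HFiles into per-region groups before any load is attempted.  It is not the patch itself: the class GroupHFilesSketch, the type PendingHFile, and the method groupByRegion are hypothetical names, row keys are plain Strings rather than byte arrays, and each file is assumed to fit inside a single region (a file that spans a region boundary would have to be split first).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class GroupHFilesSketch {

    /** Hypothetical stand-in for an HFile waiting to be bulk loaded. */
    static final class PendingHFile {
        final String family;   // column family the file belongs to
        final String path;     // location of the file
        final String firstRow; // first row key stored in the file

        PendingHFile(String family, String path, String firstRow) {
            this.family = family;
            this.path = path;
            this.firstRow = firstRow;
        }
    }

    /**
     * Groups files by the start key of the region that contains each
     * file's first row. Every group is complete when this returns, which
     * is the synchronization point: only then can a region's files be
     * handed over together as one load.
     */
    static Map<String, List<PendingHFile>> groupByRegion(
            List<String> regionStartKeys, List<PendingHFile> files) {
        TreeSet<String> starts = new TreeSet<>(regionStartKeys);
        Map<String, List<PendingHFile>> groups = new TreeMap<>();
        for (PendingHFile f : files) {
            // Greatest region start key that is <= the file's first row.
            String start = starts.floor(f.firstRow);
            if (start == null) {
                throw new IllegalArgumentException(
                        "no region covers row " + f.firstRow);
            }
            groups.computeIfAbsent(start, k -> new ArrayList<>()).add(f);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Two regions: ["", "m") and ["m", end of table).
        List<String> regionStarts = Arrays.asList("", "m");
        List<PendingHFile> files = Arrays.asList(
                new PendingHFile("cf1", "/bulk/cf1/f1", "a"),
                new PendingHFile("cf2", "/bulk/cf2/f1", "b"),
                new PendingHFile("cf1", "/bulk/cf1/f2", "q"));

        // Each entry would become one atomic multi-family load request.
        groupByRegion(regionStarts, files).forEach((start, group) ->
                System.out.println("region start '" + start + "': "
                        + group.size() + " file(s)"));
    }
}
```

In this sketch the loop over regions could later be parallelized, since the groups are independent once gathering has finished; the gathering itself is the serial part.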

Before I spend effort to parallelize this implementation more, I want to add another test to verify that this works while splits are going on.

                
> Update LoadIncrementalHFiles to use atomic bulk load RS mechanism
> -----------------------------------------------------------------
>
>                 Key: HBASE-4650
>                 URL: https://issues.apache.org/jira/browse/HBASE-4650
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver
>            Reporter: Jonathan Hsieh
>             Fix For: 0.92.0
>
>
> MR jobs and command line bulk load runs use LoadIncrementalHFiles.doBulkLoad.  This needs to be updated to group HFiles by row/region so that rows can be atomically loaded across multiple column families.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira