Posted to issues@hbase.apache.org by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org> on 2012/04/26 03:43:18 UTC

[jira] [Commented] (HBASE-5712) Parallelize load of .regioninfo files in diagnostic/repair portion of hbck.

    [ https://issues.apache.org/jira/browse/HBASE-5712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262308#comment-13262308 ] 

jiraposter@reviews.apache.org commented on HBASE-5712:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4883/
-----------------------------------------------------------

Review request for hbase, Ted Yu and Jimmy Xiang.


Summary
-------

* Parallelized the load of .regioninfo files.
* Changed TreeMap to SortedMap in method signatures.
* Renamed a test.
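
For readers without the diff at hand, here is a minimal sketch of the general idea, not the actual patch. It assumes a fixed-size thread pool and collects the results into a SortedMap. The class name, loadAll, and readRegionInfo are hypothetical stand-ins (readRegionInfo takes the place of a real HDFS read) and are not names from HBaseFsck.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelRegionInfoLoad {

    // Hypothetical stand-in for reading one .regioninfo file from HDFS.
    static String readRegionInfo(String regionDir) {
        return "regioninfo-for-" + regionDir;
    }

    // Submits one read per region directory to a fixed-size pool, so a slow
    // read only occupies one worker thread instead of stalling every read
    // queued behind it. The return type is the SortedMap interface rather
    // than the concrete TreeMap.
    static SortedMap<String, String> loadAll(List<String> regionDirs, int numThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (final String dir : regionDirs) {
                tasks.add(() -> readRegionInfo(dir));
            }
            // invokeAll blocks until every task has finished and returns the
            // futures in the same order as the submitted tasks.
            List<Future<String>> futures = pool.invokeAll(tasks);
            // Results are gathered on the calling thread, so a plain TreeMap
            // needs no extra synchronization here.
            SortedMap<String, String> result = new TreeMap<>();
            for (int i = 0; i < futures.size(); i++) {
                result.put(regionDirs.get(i), futures.get(i).get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        SortedMap<String, String> infos =
            loadAll(Arrays.asList("region-b", "region-a", "region-c"), 2);
        System.out.println(infos.firstKey()); // prints region-a
    }
}
```

Declaring SortedMap in the signature leaves the caller free to pass or receive any sorted implementation, which is presumably the motivation for the TreeMap to SortedMap change listed above.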


This addresses bug HBASE-5712.
    https://issues.apache.org/jira/browse/HBASE-5712


Diffs
-----

  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 66156c2 
  src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java 6b64f10 

Diff: https://reviews.apache.org/r/4883/diff


Testing
-------

Ran patch 10x on trunk, passes.  Ran 1x on 0.92 and 0.94.

There is a 0.90 version that is nearly identical, except that it ignores the changes near HBaseFsck lines 671-680.


Thanks,

jmhsieh


                
> Parallelize load of .regioninfo files in diagnostic/repair portion of hbck.
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-5712
>                 URL: https://issues.apache.org/jira/browse/HBASE-5712
>             Project: HBase
>          Issue Type: Sub-task
>          Components: hbck
>            Reporter: Jonathan Hsieh
>            Assignee: Jonathan Hsieh
>         Attachments: hbase-5712.patch
>
>
> On heavily loaded HDFS clusters, some datanodes may not respond quickly, and a read backs off for 60s before trying another datanode.  Portions of the information gathered from HDFS (the .regioninfo files) are loaded serially.  On HBase clusters with hundreds, thousands, or tens of thousands of regions, each of these 60s delays blocks progress, which can be very painful.
> There is already some parallelization of portions of the HDFS information load operations; the goal here is to move the reading of the .regioninfo files into the parallelized sections.
