Posted to issues@hbase.apache.org by "stack (Commented) (JIRA)" <ji...@apache.org> on 2011/11/16 04:09:03 UTC
[jira] [Commented] (HBASE-4797) [availability] Give recovered.edits files better names, ones that include first and last sequence id so we can skip files with edits we know older than current region has

    [ https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150992#comment-13150992 ] 

stack commented on HBASE-4797:
------------------------------

Oh... I suppose it's a bit worse than I thought.  I'm looking at a region that has nearly 6k recovered.edits files to replay.  The RegionServer is doing this per file:

{code}
2011-11-16 03:06:02,403 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Applied 0, skipped 33, firstSequenceidInLog=296860, maxSequenceidInLog=351600, path=hdfs://sv4r11s38:7000/hbase/TestTable/69ab6eb0e2feff1fda52d36d8fa75798/recovered.edits/0000000000000296860
2011-11-16 03:06:02,405 INFO org.apache.hadoop.hbase.regionserver.HRegion: Replaying edits from hdfs://sv4r11s38:7000/hbase/TestTable/69ab6eb0e2feff1fda52d36d8fa75798/recovered.edits/0000000000000296914; minSequenceid=351600; path=hdfs://sv4r11s38:7000/hbase/TestTable/69ab6eb0e2feff1fda52d36d8fa75798/recovered.edits/0000000000000296914
2011-11-16 03:06:05,097 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:7003-0x133a5bab186271f Attempting to transition node 69ab6eb0e2feff1fda52d36d8fa75798 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2011-11-16 03:06:05,278 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:7003-0x133a5bab186271f Successfully transitioned node 69ab6eb0e2feff1fda52d36d8fa75798 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2011-11-16 03:06:05,278 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Applied 0, skipped 33, firstSequenceidInLog=296914, maxSequenceidInLog=351600, path=hdfs://sv4r11s38:7000/hbase/TestTable/69ab6eb0e2feff1fda52d36d8fa75798/recovered.edits/0000000000000296914
2011-11-16 03:06:05,279 INFO org.apache.hadoop.hbase.regionserver.HRegion: Replaying edits from hdfs://sv4r11s38:7000/hbase/TestTable/69ab6eb0e2feff1fda52d36d8fa75798/recovered.edits/0000000000000296970; minSequenceid=351600; path=hdfs://sv4r11s38:7000/hbase/TestTable/69ab6eb0e2feff1fda52d36d8fa75798/recovered.edits/0000000000000296970
2011-11-16 03:06:05,952 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:7003-0x133a5bab186271f Attempting to transition node 69ab6eb0e2feff1fda52d36d8fa75798 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2011-11-16 03:06:06,093 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:7003-0x133a5bab186271f Successfully transitioned node 69ab6eb0e2feff1fda52d36d8fa75798 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2011-11-16 03:06:06,093 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Applied 0, skipped 44, firstSequenceidInLog=296970, maxSequenceidInLog=351600, path=hdfs://sv4r11s38:7000/hbase/TestTable/69ab6eb0e2feff1fda52d36d8fa75798/recovered.edits/0000000000000296970
2011-11-16 03:06:06,094 INFO org.apache.hadoop.hbase.regionserver.HRegion: Replaying edits from hdfs://sv4r11s38:7000/hbase/TestTable/69ab6eb0e2feff1fda52d36d8fa75798/recovered.edits/0000000000000297041; minSequenceid=351600; path=hdfs://sv4r11s38:7000/hbase/TestTable/69ab6eb0e2feff1fda52d36d8fa75798/recovered.edits/0000000000000297041
2011-11-16 03:06:06,795 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:7003-0x133a5bab186271f Attempting to transition node 69ab6eb0e2feff1fda52d36d8fa75798 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
2011-11-16 03:06:06,810 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:7003-0x133a5bab186271f Successfully transitioned node 69ab6eb0e2feff1fda52d36d8fa75798 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
{code}
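
To make the proposal in the issue title concrete, here is a rough sketch of the skip check, not a patch. In the log above each file is named after its first sequence id only, so the region has to open every file just to find that all of its edits get skipped. The sketch assumes a hypothetical "<first>-<last>" file name (that format is my invention for illustration, as are the function names and the example ids in the usage note) and assumes an edit is skipped when its sequence id is at or below the region's minSequenceid. It is written in Python only to keep it short; it is not HBase code.

```python
import re
from typing import List, NamedTuple, Optional

# Hypothetical naming scheme: "<first seq id>-<last seq id>", zero-padded.
# The files in the log above carry only the first sequence id.
RANGE_NAME = re.compile(r"^(\d+)-(\d+)$")


class EditsRange(NamedTuple):
    first_seq_id: int
    last_seq_id: int


def parse_range(file_name: str) -> Optional[EditsRange]:
    """Return the range encoded in the name, or None for old-style names."""
    match = RANGE_NAME.match(file_name)
    if match is None:
        return None
    return EditsRange(int(match.group(1)), int(match.group(2)))


def files_to_replay(file_names: List[str], min_seq_id: int) -> List[str]:
    """Keep only files that may hold edits newer than min_seq_id.

    A file is dropped only when its name proves that every edit in it is
    at or below min_seq_id (assumed skip rule). Names that do not parse
    are kept, so old-style files are still opened and replayed.
    """
    keep = []
    for name in sorted(file_names):
        edits_range = parse_range(name)
        if edits_range is not None and edits_range.last_seq_id <= min_seq_id:
            continue  # nothing newer than what the region already has
        keep.append(name)
    return keep
```

With minSequenceid=351600 as in the log, a file named 0000000000000296860-0000000000000296913 (made-up last id) would be dropped without being opened, while an old-style name such as 0000000000000296914 would still be replayed as it is today.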
                
> [availability] Give recovered.edits files better names, ones that include first and last sequence id so we can skip files with edits we know older than current region has
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4797
>                 URL: https://issues.apache.org/jira/browse/HBASE-4797
>             Project: HBase
>          Issue Type: Bug
>          Components: performance
>            Reporter: stack
>
> Testing 0.92, I crashed out all servers.  Another bug means WALs are not getting cleaned, so I had 7000 regions to replay.  The distributed split code did a nice job and the cluster came back, but interestingly some hot regions ended up with loads of recovered.edits files -- tens if not hundreds -- to replay against the region (can we bulk load recovered.edits instead of replaying them?).  Each recovered.edits file takes about a second to process (though each seems to hold only about 30-odd edits).  The region is unavailable during this time.
