Posted to issues@hbase.apache.org by "chunhui shen (Commented) (JIRA)" <ji...@apache.org> on 2011/12/09 04:09:43 UTC

[jira] [Commented] (HBASE-4988) MetaServer crash cause all splitting regionserver abort

    [ https://issues.apache.org/jira/browse/HBASE-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165805#comment-13165805 ] 

chunhui shen commented on HBASE-4988:
-------------------------------------

Logs from one of the region servers that aborted:
{code}
2011-12-07 17:49:17,737 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined writetest,28TPVACCO3EI47TH472E1997TX1ZDFQ7XUCMBA2LUKOD7G0U3NQ2L2FG0ILRGZ5ETHFESE5QIMFN8ONUDUXB80G7MEK58G7YM4EG,1323251351741.6399c204b8d45568a782fd0157d6700d.; next sequenceid=3483318538 
2011-12-07 17:49:17,737 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for writetest,2FB\xC0EE\xC2LDFG\xC8\xB6GV\xCE\xC6F4<\xBBE\xC87BM\xC0\xCD\xC3\xC8A\xB3\xCE\xD5G\xCBI\xBA\xBB\xCB\xD7R\xD2=\xC5>2U;P\xD2D\xCD\xBA\xC6\xC6A\xC1KI\xCDND\xC8\xCEKG\xC3\xCC\xCD\xB4\xC1=\xD0\xC4\xD2FSSPE\xD0V\xCE5@\xBCCN\xC4\xCB\xBE7L\xC8E\xC1\xBD\xCFH,1323251351741.a639e2eda8b2de9ca368c1a13ebbcb44. because Region has references on open; priority=16, compaction queue size=1 
2011-12-07 17:49:17,737 INFO org.apache.hadoop.hbase.catalog.CatalogTracker: Failed verification of .META.,,1 at address=dw83.kgb.sqa.cm4:60020; java.io.EOFException 
2011-12-07 17:49:17,737 INFO org.apache.hadoop.hbase.catalog.CatalogTracker: Current cached META location is not valid, resetting 
2011-12-07 17:49:17,740 INFO org.apache.hadoop.hbase.catalog.CatalogTracker: Failed verification of .META.,,1 at address=dw83.kgb.sqa.cm4:60020; java.net.ConnectException: Connection refused 
2011-12-07 17:49:17,740 WARN org.apache.hadoop.hbase.regionserver.CompactSplitThread: Running rollback of failed split of writetest,28TPVACCO3EI47TH472E1997TX1ZDFQ7XUCMBA2LUKOD7G0U3NQ2L2FG0ILRGZ5ETHFESE5QIMFN8ONUDUXB80G7MEK58G7YM4EG,1323240352298.c7bde4437e5b12bc7226485dcbc2700b.; Timed out (2147483647ms) 
2011-12-07 17:49:17,740 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=dw87.kgb.sqa.cm4,60020,1323244700069, load=(requests=393, regions=12, usedHeap=742, maxHeap=15872): Abort; we got an error after point-of-no-return 
{code}
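
To make the sequence in the log easier to follow, here is a minimal sketch of why a META failure at this point ends in an abort rather than a clean rollback. It is not the HBase implementation: the class name, the `simulate` helper, the `Outcome` enum, and every journal step other than PONR are hypothetical. The only detail taken from the issue is the ordering, where PONR is journaled before the parent is offlined in META.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitRollbackSketch {
    // Hypothetical, simplified journal steps; only PONR is named in the issue.
    enum JournalEntry { CREATE_SPLIT_DIR, CLOSED_PARENT_REGION, PONR }

    enum Outcome { SPLIT_COMPLETED, ROLLED_BACK, SERVER_ABORTED }

    /** Walks the journal backwards; returns false once PONR is found. */
    static boolean rollback(List<JournalEntry> journal) {
        for (int i = journal.size() - 1; i >= 0; i--) {
            if (journal.get(i) == JournalEntry.PONR) {
                // Past the point of no return, so the step cannot be undone.
                return false;
            }
            // A real implementation would undo the journaled step here.
        }
        return true;
    }

    /**
     * Simulates one split attempt.
     * failEarly: an error occurs before PONR is journaled.
     * metaAvailable: whether the META edit after PONR succeeds.
     */
    static Outcome simulate(boolean failEarly, boolean metaAvailable) {
        List<JournalEntry> journal = new ArrayList<>();
        try {
            journal.add(JournalEntry.CREATE_SPLIT_DIR);
            if (failEarly) {
                throw new IllegalStateException("failure before PONR");
            }
            journal.add(JournalEntry.CLOSED_PARENT_REGION);
            // As in the issue: PONR is recorded first, then META is edited.
            journal.add(JournalEntry.PONR);
            if (!metaAvailable) {
                throw new IllegalStateException("META edit failed");
            }
            return Outcome.SPLIT_COMPLETED;
        } catch (IllegalStateException e) {
            // A rollback that hits PONR forces the server to abort.
            return rollback(journal) ? Outcome.ROLLED_BACK : Outcome.SERVER_ABORTED;
        }
    }

    public static void main(String[] args) {
        System.out.println("META up:              " + simulate(false, true));
        System.out.println("early failure:        " + simulate(true, false));
        System.out.println("META down after PONR: " + simulate(false, false));
    }
}
```

In this sketch, the case where META is unreachable after PONR has been journaled is the only one that ends in an abort. That matches the FATAL line above.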
                
> MetaServer crash cause all splitting regionserver abort
> -------------------------------------------------------
>
>                 Key: HBASE-4988
>                 URL: https://issues.apache.org/jira/browse/HBASE-4988
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>
> Currently, if the meta server crashes,
> every region server that is in the middle of a split aborts itself.
> This is because of the following code:
> {code}
> this.journal.add(JournalEntry.PONR);
> MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
>             this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
> {code}
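> Once the journal contains the PONR (point-of-no-return) entry, a failed split cannot be rolled back, so the rollback aborts the region server instead.
> This is very damaging in a write-heavy environment when the meta server crashes.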
