Posted to issues@hbase.apache.org by "chunhui shen (Commented) (JIRA)" <ji...@apache.org> on 2012/01/11 07:54:39 UTC

[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss

    [ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183915#comment-13183915 ] 

chunhui shen commented on HBASE-5179:
-------------------------------------

Here are the master logs. Follow the region a04d0ac0a360e8cf5edf74af4ce64b16:
{code}
2011-12-30 02:20:05,285 INFO org.apache.hadoop.hbase.master.HMaster: Master startup proceeding: master failover 
2011-12-30 02:20:06,779 INFO org.apache.hadoop.hbase.master.ServerManager: Server start rejected; we already have dw83.kgb.sqa.cm4:60020 registered; existingServer=serverName=dw83.kgb.sqa.cm4,60020,1325180976942, load=(requests=0, regions=7, usedHeap=10831, maxHeap=15872), newServer=serverName=dw83.kgb.sqa.cm4,60020,1325182806080, load=(requests=0, regions=0, usedHeap=230, maxHeap=15872) 
2011-12-30 02:20:06,779 INFO org.apache.hadoop.hbase.master.ServerManager: Triggering server recovery; existingServer dw83.kgb.sqa.cm4,60020,1325180976942 looks stale 
2011-12-30 02:20:06,780 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: based on AM, current region=-ROOT-,,0.70236052 is on server=serverName=dw80.kgb.sqa.cm4,60020,1325180470774, load=(requests=0, regions=0, usedHeap=0, maxHeap=0) server being checked: dw83.kgb.sqa.cm4,60020,1325180976942 
2011-12-30 02:20:06,780 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: based on AM, current region=.META.,,1.1028785192 is on server=serverName=dw80.kgb.sqa.cm4,60020,1325180470774, load=(requests=0, regions=0, usedHeap=0, maxHeap=0) server being checked: dw83.kgb.sqa.cm4,60020,1325180976942 
2011-12-30 02:20:06,781 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=dw83.kgb.sqa.cm4,60020,1325180976942 to dead servers, submitted shutdown handler to be executed, root=false, meta=false 
2011-12-30 02:20:07,839 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Creating writer path=hdfs://dw74.kgb.sqa.cm4:9000/hbase-common/writetest/a04d0ac0a360e8cf5edf74af4ce64b16/recovered.edits/0000000000965355783.temp region=a04d0ac0a360e8cf5edf74af4ce64b16 
2011-12-30 02:20:08,965 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x134784f727b0543 Creating (or updating) unassigned node for a04d0ac0a360e8cf5edf74af4ce64b16 with OFFLINE state 
2011-12-30 02:20:08,988 INFO org.apache.hadoop.hbase.master.AssignmentManager: Failed-over master needs to process 14 regions in transition 
2011-12-30 02:20:09,017 INFO org.apache.hadoop.hbase.master.AssignmentManager: Processing region writetest,B8VCH6I7EP0SLJA6KU8VTE75RCCZMZ14GTBRSTC7QOW9L2Q818R1O4PLA9ZX64JD5ZZTSAK021NUYUUHJ0BS9NTTCQ09PBRZMZPL,1325179237366.a04d0ac0a360e8cf5edf74af4ce64b16. in state M_ZK_REGION_OFFLINE 
2011-12-30 02:20:09,017 DEBUG org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED event for a04d0ac0a360e8cf5edf74af4ce64b16 
2011-12-30 02:20:09,017 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=writetest,B8VCH6I7EP0SLJA6KU8VTE75RCCZMZ14GTBRSTC7QOW9L2Q818R1O4PLA9ZX64JD5ZZTSAK021NUYUUHJ0BS9NTTCQ09PBRZMZPL,1325179237366.a04d0ac0a360e8cf5edf74af4ce64b16. state=OFFLINE, ts=1325182808966 
2011-12-30 02:20:09,020 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region writetest,B8VCH6I7EP0SLJA6KU8VTE75RCCZMZ14GTBRSTC7QOW9L2Q818R1O4PLA9ZX64JD5ZZTSAK021NUYUUHJ0BS9NTTCQ09PBRZMZPL,1325179237366.a04d0ac0a360e8cf5edf74af4ce64b16. to dw81.kgb.sqa.cm4,60020,1325181205124 
2011-12-30 02:20:09,365 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region writetest,B8VCH6I7EP0SLJA6KU8VTE75RCCZMZ14GTBRSTC7QOW9L2Q818R1O4PLA9ZX64JD5ZZTSAK021NUYUUHJ0BS9NTTCQ09PBRZMZPL,1325179237366.a04d0ac0a360e8cf5edf74af4ce64b16. on dw81.kgb.sqa.cm4,60020,1325181205124 
2011-12-30 02:20:20,144 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Closed path hdfs://dw74.kgb.sqa.cm4:9000/hbase-common/writetest/a04d0ac0a360e8cf5edf74af4ce64b16/recovered.edits/0000000000965355783.temp (wrote 146434 edits in 1761ms) 
{code}
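
To make the ordering in this excerpt concrete: the region is reported as opened on dw81 at 02:20:09,365, while HLogSplitter does not close the recovered.edits file for the same region until 02:20:20,144. Below is a minimal sketch for computing that gap from the two log timestamps. The class and method names (LogGap, millisBetween) are illustrative only and are not part of HBase.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not HBase code: measures the time between two log lines.
final class LogGap {
    // Timestamp layout used by the master log lines quoted above.
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Returns the number of milliseconds from `earlier` to `later`.
    static long millisBetween(String earlier, String later) {
        LocalDateTime a = LocalDateTime.parse(earlier, LOG_TS);
        LocalDateTime b = LocalDateTime.parse(later, LOG_TS);
        return Duration.between(a, b).toMillis();
    }

    public static void main(String[] args) {
        // OpenedRegionHandler: "Opened region ... on dw81..."
        String regionOpened = "2011-12-30 02:20:09,365";
        // HLogSplitter: "Closed path .../recovered.edits/0000000000965355783.temp"
        String splitClosed = "2011-12-30 02:20:20,144";
        System.out.println(millisBetween(regionOpened, splitClosed) + " ms");
    }
}
```

With the two timestamps above this prints 10779 ms, so the region had been open for roughly 10.8 seconds before its recovered edits were completely written.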
                
> Concurrent processing of processFaileOver and ServerShutdownHandler  may cause region is assigned before completing split log, it would cause data loss
> -------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5179
>                 URL: https://issues.apache.org/jira/browse/HBASE-5179
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>
> If the master's failover processing and ServerShutdownHandler's processing happen concurrently, the following case may occur:
> 1. The master completes splitLogAfterStartup().
> 2. RegionserverA restarts, and ServerShutdownHandler starts processing it.
> 3. The master starts rebuildUserRegions, and RegionserverA is considered a dead server.
> 4. The master starts assigning the regions of RegionserverA because it was marked as a dead server in step 3.
> However, while step 4 (assigning regions) is running, ServerShutdownHandler may still be splitting the log, so data loss may occur.
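
One way to picture the ordering the description asks for is a small guard that refuses to assign a dead server's regions until log splitting for that server has finished. The sketch below is only an illustration under that assumption. It is not the patch for this issue, and SplitLogGuard and its methods are hypothetical names that do not exist in HBase.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical guard, not HBase code: tracks dead servers whose logs are not yet split.
final class SplitLogGuard {
    // Servers marked dead whose log splitting has not completed yet.
    private final Set<String> pendingSplit = ConcurrentHashMap.newKeySet();

    // Called when a server is added to the dead servers list (step 3).
    void markDead(String serverName) {
        pendingSplit.add(serverName);
    }

    // Called by the shutdown handling path once split log has finished.
    void splitLogCompleted(String serverName) {
        pendingSplit.remove(serverName);
    }

    // Step 4 should proceed only when this returns true for the region's old server.
    boolean canAssignRegionsOf(String serverName) {
        return !pendingSplit.contains(serverName);
    }

    public static void main(String[] args) {
        SplitLogGuard guard = new SplitLogGuard();
        String stale = "dw83.kgb.sqa.cm4,60020,1325180976942";
        guard.markDead(stale);
        System.out.println(guard.canAssignRegionsOf(stale)); // false: split still pending
        guard.splitLogCompleted(stale);
        System.out.println(guard.canAssignRegionsOf(stale)); // true: safe to assign
    }
}
```

In the log above, such a check would have held back the assignment at 02:20:09,020 until HLogSplitter closed the recovered.edits file.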
