Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2010/11/10 06:36:15 UTC

[jira] Resolved: (HBASE-3122) NPE in master.AssignmentManager if all region servers shut down

     [ https://issues.apache.org/jira/browse/HBASE-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-3122.
--------------------------

    Resolution: Invalid

Resolving as no longer valid; looks like this is fixed.  I tested killing a single regionserver multiple times.  -ROOT- and .META. hang out in RIT, cycling and timing out while waiting for someone to assign them... then if you start up an RS, it just goes ahead and assigns them.  Here are some log snippets:

{code}
REGION_OPENED
2010-11-10 05:28:11,038 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x12c343c19d40000 Successfully deleted unassigned node for region 686d4b2520f56c32877d51bcc08b355a in expected state RS_ZK_REGION_OPENED
2010-11-10 05:28:11,039 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region usertable,user992934061,1288992609496.686d4b2520f56c32877d51bcc08b355a. on sv2borg180,60020,1289366835169

....

2010-11-10 05:31:33,027 INFO org.apache.hadoop.hbase.zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, processing expiration [sv2borg180,60020,1289366835169]
2010-11-10 05:31:33,028 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=sv2borg180,60020,1289366835169 to dead servers, submitted shutdown handler to be executed, root=true, meta=true
2010-11-10 05:31:33,028 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for sv2borg180,60020,1289366835169
2010-11-10 05:31:33,031 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting 1 hlog(s) in hdfs://sv2borg180:20000/hbase/.logs/sv2borg180,60020,1289366835169
2010-11-10 05:31:33,031 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting hlog 1 of 1: hdfs://sv2borg180:20000/hbase/.logs/sv2borg180,60020,1289366835169/10.20.20.180%3A60020.1289366835754, length=0

....

2010-11-10 05:31:47,409 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Moved 0 log files to /hbase/.oldlogs
2010-11-10 05:31:47,433 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: hlog file splitting completed in 14404 ms for hdfs://sv2borg180:20000/hbase/.logs/sv2borg180,60020,1289366835169
2010-11-10 05:31:47,443 INFO org.apache.hadoop.hbase.catalog.RootLocationEditor: Unsetting ROOT region location in ZooKeeper
2010-11-10 05:31:47,466 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x12c343c19d40000 Creating (or updating) unassigned node for 70236052 with OFFLINE state
2010-11-10 05:31:47,492 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x12c343c19d40000 Creating (or updating) unassigned node for 1028785192 with OFFLINE state
2010-11-10 05:31:47,517 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=sv2borg180:60000, region=1028785192/.META.
2010-11-10 05:31:47,517 INFO org.apache.hadoop.hbase.catalog.CatalogTracker: Failed verification of .META.,,1 at address=sv2borg180:60020; java.net.ConnectException: Connection refused
2010-11-10 05:31:47,517 INFO org.apache.hadoop.hbase.catalog.CatalogTracker: Current cached META location is not valid, resetting
2010-11-10 05:32:23,752 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  .META.,,1.1028785192 state=OFFLINE, ts=1289367107492
2010-11-10 05:32:23,752 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been OFFLINE for too long, reassigning .META.,,1.1028785192 to a random server
2010-11-10 05:32:23,752 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=.META.,,1.1028785192 state=OFFLINE, ts=1289367107492
2010-11-10 05:32:23,752 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  -ROOT-,,0.70236052 state=OFFLINE, ts=1289367107466
2010-11-10 05:32:23,752 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been OFFLINE for too long, reassigning -ROOT-,,0.70236052 to a random server
2010-11-10 05:32:23,752 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=-ROOT-,,0.70236052 state=OFFLINE, ts=1289367107466
....


2010-11-10 05:33:16,718 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv2borg180,60020,1289367196372, regionCount=0, userLoad=false
2010-11-10 05:33:23,753 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  .META.,,1.1028785192 state=OFFLINE, ts=1289367173753
2010-11-10 05:33:23,753 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been OFFLINE for too long, reassigning .META.,,1.1028785192 to a random server
2010-11-10 05:33:23,753 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=.META.,,1.1028785192 state=OFFLINE, ts=1289367173753
2010-11-10 05:33:23,754 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for .META.,,1.1028785192 so generated a random one; hri=.META.,,1.1028785192, src=, dest=sv2borg180,60020,1289367196372; 1 (online=1, exclude=null) available servers
2010-11-10 05:33:23,754 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region .META.,,1.1028785192 to sv2borg180,60020,1289367196372
2010-11-10 05:33:23,754 DEBUG org.apache.hadoop.hbase.master.ServerManager: New connection to sv2borg180,60020,1289367196372
2010-11-10 05:33:23,802 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=sv2borg180,60020,1289367196372, region=1028785192/.META.
2010-11-10 05:33:24,202 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=sv2borg180,60020,1289367196372, region=1028785192/.META.
2010-11-10 05:33:33,754 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  -ROOT-,,0.70236052 state=OFFLINE, ts=1289367173754
2010-11-10 05:33:33,754 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been OFFLINE for too long, reassigning -ROOT-,,0.70236052 to a random server
2010-11-10 05:33:33,754 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=-ROOT-,,0.70236052 state=OFFLINE, ts=1289367173754
2010-11-10 05:33:33,754 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for -ROOT-,,0.70236052 so generated a random one;   hri=-ROOT-,,0.70236052, src=, dest=sv2borg180,60020,1289367196372; 1 (online=1, exclude=null) available servers
2010-11-10 05:33:33,754 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region -ROOT-,,0.70236052 to sv2borg180,60020,1289367196372
....
{code}

After doing all of the above multiple times, hbck reports all is well.
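
To make the cycling in the log above easier to follow, here is a minimal, hypothetical sketch of that timeout/reassign loop.  It is not the real AssignmentManager/TimeoutMonitor code; the class name, the 30 second timeout, and the plain-String region/server types are assumptions for illustration.  Regions stuck OFFLINE past the timeout get handed to a random online server; with zero servers online the loop simply comes around again, which is the -ROOT-/.META. cycling seen above until a regionserver registers at 05:33:16.

{code}
import java.util.List;
import java.util.Map;
import java.util.Random;

// Illustrative sketch only -- NOT the actual AssignmentManager source. It
// mimics the timeout/reassign cycle visible in the log above: every period,
// regions stuck OFFLINE are forced back to OFFLINE and handed to whatever
// servers are currently online; with zero servers the loop just tries again.
class TimeoutMonitorSketch {
  static final long TIMEOUT_MS = 30 * 1000;   // hypothetical RIT timeout
  final Random random = new Random();

  // regionsInTransition maps region name -> timestamp of its last transition
  void chore(Map<String, Long> regionsInTransition, List<String> onlineServers) {
    long now = System.currentTimeMillis();
    for (Map.Entry<String, Long> e : regionsInTransition.entrySet()) {
      if (now - e.getValue() < TIMEOUT_MS) continue;   // not stuck yet
      String region = e.getKey();
      System.out.println("Region " + region + " OFFLINE too long, reassigning");
      if (onlineServers.isEmpty()) {
        // Nothing to assign to: leave the region in RIT and retry on the next
        // pass -- the -ROOT-/.META. cycling seen in the log above.
        continue;
      }
      String dest = onlineServers.get(random.nextInt(onlineServers.size()));
      System.out.println("Assigning " + region + " to " + dest);
      e.setValue(now);   // restart the transition clock for this region
    }
  }
}
{code}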

> NPE in master.AssignmentManager if all region servers shut down
> ---------------------------------------------------------------
>
>                 Key: HBASE-3122
>                 URL: https://issues.apache.org/jira/browse/HBASE-3122
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.0
>            Reporter: Andrew Purtell
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.90.0
>
>
> 10/10/18 16:26:44 INFO catalog.CatalogTracker: acer,60020,1287443908850 carrying .META.; unsetting .META. location
> 10/10/18 16:26:44 INFO catalog.CatalogTracker: Current cached META location is not valid, resetting
> 10/10/18 16:26:44 INFO handler.ServerShutdownHandler: Splitting logs for acer,60020,1287443908850
> 10/10/18 16:26:44 INFO zookeeper.ZKUtil: hconnection-0x12bc1a2f0a60001 Set watcher on existing znode /hbase/root-region-server
> 10/10/18 16:26:44 INFO catalog.RootLocationEditor: Unsetting ROOT region location in ZooKeeper
> 10/10/18 16:26:44 DEBUG zookeeper.ZKAssign: master:60000-0x12bc1a2f0a60000 Creating (or updating) unassigned node for 70236052 with OFFLINE state
> 10/10/18 16:26:44 WARN master.LoadBalancer: Wanted to do random assignment but no servers to assign to
> 10/10/18 16:26:44 ERROR executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
> java.lang.NullPointerException
> 	at org.apache.hadoop.hbase.master.LoadBalancer$RegionPlan.toString(LoadBalancer.java:595)
> 	at java.lang.String.valueOf(String.java:2826)
> 	at java.lang.StringBuilder.append(StringBuilder.java:115)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:803)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:777)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:720)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:640)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManager.java:922)
> 	at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:97)
> 	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:150)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:619)
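
As a side note on the original report above, here is a hedged reconstruction of what the stack trace suggests went wrong; the names below (RegionPlanSketch, randomAssignment) are illustrative only, not the real 0.90 LoadBalancer/AssignmentManager code.  With no regionservers online, the random assignment yields a null destination; building the plan's string for the getRegionPlan debug log then dereferences it and throws the NPE.  A null-safe toString (or skipping assignment while no servers are available) avoids it.

{code}
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical reconstruction of the failure path in the stack trace above --
// not the actual HBase source. The point is only that a "random assignment"
// with zero online servers yields a null destination, and stringifying the
// resulting plan for a log message is where the NPE would surface.
class RegionPlanSketch {
  final String regionName;
  final String source;       // may legitimately be empty
  final String destination;  // null when there was no server to assign to

  RegionPlanSketch(String regionName, String source, String destination) {
    this.regionName = regionName;
    this.source = source;
    this.destination = destination;
  }

  @Override
  public String toString() {
    // Null-safe variant; calling a method on a null destination here is the
    // kind of dereference the stack trace points at.
    return "hri=" + regionName
        + ", src=" + (source == null ? "" : source)
        + ", dest=" + (destination == null ? "" : destination);
  }

  /** Picks a random server, or returns null when the list is empty. */
  static String randomAssignment(List<String> servers, Random rng) {
    if (servers.isEmpty()) {
      System.out.println("WARN: wanted to do random assignment but no servers to assign to");
      return null;
    }
    return servers.get(rng.nextInt(servers.size()));
  }

  public static void main(String[] args) {
    String dest = randomAssignment(Collections.<String>emptyList(), new Random());
    RegionPlanSketch plan = new RegionPlanSketch("-ROOT-,,0.70236052", "", dest);
    // With the null-safe toString this logs cleanly instead of throwing.
    System.out.println("Generated plan: " + plan);
  }
}
{code}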

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.