Posted to user@hbase.apache.org by sagar naik <sn...@attributor.com> on 2011/12/05 19:45:02 UTC

Regions failed to migrate

Scenario:
  - The regionserver machine was rebooted (a random AWS reboot).
    The regionserver logs show a shutdown.

  - The datanode on the same machine also received a kill command.

  - The master did not migrate the regions, although it did detect that the node was down.

  - When I checked 6 hours later, HBase was in an inconsistent state (according to hbck).

  - When I restarted the regionserver, it gave me an UnknownScannerException.

  - I ended up restarting only the master, after which the regionserver started fine.

  - About 1 hour after the regionserver went down, a major compaction
    (run from cron) kicked in.

Question:

    Why did the regions on the regionserver not migrate? Am I missing
something, such as some config params?
    Most of the config is default, except for the compaction interval.
Thanks


REGIONSERVER LOGS:

2011-12-04 22:19:09,586 INFO
org.apache.hadoop.hbase.regionserver.wal.HLog: moving old hlog file
/user/nileus/hbase-storage/.logs/ip-X-X-X-X,60020,1322010465849/ip-10-174-43-151.us-west-1.compute.internal%3A60020.1322967102356
whose highest sequenceid is 127636589 to
/user/nileus/hbase-storage/.oldlogs/ip-X-X-x-X%3A60020.1322967102356
2011-12-04 22:43:47,291 INFO
org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook
starting; hbase.shutdown.hook=true;
fsShutdownHook=Thread[Thread-15,5,main]
2011-12-04 22:43:47,291 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Shutdown
hook


MASTER LOG:
2011-12-04 22:46:39,133 INFO
org.apache.hadoop.hbase.zookeeper.RegionServerTracker: RegionServer
ephemeral node deleted, processing expiration
[datanode001,60020,1322010465849]
2011-12-04 22:46:39,133 INFO
org.apache.hadoop.hbase.zookeeper.RegionServerTracker: No HServerInfo
found for datanode001,60020,1322010465849
2011-12-04 22:47:20,828 INFO
org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing.
servers=3 regions=421 average=140.33333 mostloaded=141 leastloaded=141
2011-12-04 22:52:20,893 INFO
org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing.
servers=3 regions=421 average=140.33333 mostloaded=141 leastloaded=141
2011-12-04 22:57:20,959 INFO
org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing.
servers=3 regions=421 average=140.33333 mostloaded=141 leastloaded=141
2


POST RESTART REGIONSERVER LOGS

2011-12-05 07:21:04,589 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.UnknownScannerException: Name: -1
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1809)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
2011-12-05 07:21:14,469 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.UnknownScannerException: Name: -1
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1809)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
Mon Dec  5 07:21:25 PST 2011 Killing regionserver
2011-12-05 07:21:25,647 INFO
org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook
starting; hbase.shutdown.hook=true;
fsShutdownHook=Thread[Thread-15,5,main]
2011-12-05 07:21:25,647 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Shutdown
hook
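
One detail in the logs above is worth isolating: the regionserver's log directory and the master's expiration message carry the same port and startcode but different hostnames. The following is a minimal sketch, not part of HBase, that parses the "host,port,startcode" strings and flags that situation. The names parse_server_name and same_instance_different_host are hypothetical helpers made up for illustration.

```python
from typing import NamedTuple


class ServerName(NamedTuple):
    host: str
    port: int
    startcode: int


def parse_server_name(raw: str) -> ServerName:
    """Split a 'host,port,startcode' string (as seen in the logs) into parts."""
    host, port, startcode = raw.strip().split(",")
    return ServerName(host, int(port), int(startcode))


def same_instance_different_host(a: ServerName, b: ServerName) -> bool:
    """True when port and startcode match but the hostnames differ."""
    return (a.port, a.startcode) == (b.port, b.startcode) and a.host != b.host


# Names copied from the logs above; the first hostname is redacted in the original.
from_regionserver_log = parse_server_name("ip-X-X-X-X,60020,1322010465849")
from_master_log = parse_server_name("datanode001,60020,1322010465849")

if same_instance_different_host(from_regionserver_log, from_master_log):
    print("hostname mismatch:", from_regionserver_log.host, "vs", from_master_log.host)
```

With the two names from this thread the check fires, since only the host part differs.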

Re: Regions failed to migrate

Posted by Shrijeet Paliwal <sh...@rocketfuel.com>.
What J-D said, or maybe https://issues.apache.org/jira/browse/HBASE-4109?

On Tue, Dec 6, 2011 at 10:01 AM, Jean-Daniel Cryans <jd...@apache.org> wrote:

> Your DNS is set up wrong:
>
> org.apache.hadoop.hbase.zookeeper.RegionServerTracker: No HServerInfo
> found for datanode001,60020,1322010465849
>
> That's what that line means. The master did detect that a region
> server went down, but the name was unknown to it.
>
> J-D

Re: Regions failed to migrate

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Your DNS is set up wrong:

org.apache.hadoop.hbase.zookeeper.RegionServerTracker: No HServerInfo
found for datanode001,60020,1322010465849

That's what that line means. The master did detect that a region
server went down, but the name was unknown to it.

J-D
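
A quick way to look for this kind of DNS problem is to check whether a hostname resolves to an IP and whether that IP resolves back to the same name. The sketch below is only an illustration under that assumption, not how HBase itself resolves names. check_dns_round_trip is a hypothetical helper, and the strict string comparison is a deliberate simplification (it will also flag a short name versus its fully qualified form).

```python
import socket
from typing import Callable, Tuple


def check_dns_round_trip(
    hostname: str,
    forward: Callable[[str], str] = socket.gethostbyname,
    reverse: Callable[[str], tuple] = socket.gethostbyaddr,
) -> Tuple[str, str, bool]:
    """Resolve hostname -> IP -> name and report whether the two names agree.

    The resolver functions are parameters so the check can be exercised
    without touching the network; by default the standard socket calls are used.
    """
    ip = forward(hostname)
    # gethostbyaddr returns (primary_name, alias_list, ip_list).
    reverse_name = reverse(ip)[0]
    # Case-insensitive comparison, ignoring a trailing dot on either name.
    consistent = reverse_name.lower().rstrip(".") == hostname.lower().rstrip(".")
    return ip, reverse_name, consistent
```

Running check_dns_round_trip(socket.getfqdn()) on each node, and again from the master against each regionserver's name, would show whether every machine agrees on what the others are called. A False in the last field means the forward and reverse names differ and the host's DNS or /etc/hosts entries deserve a closer look.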
