Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2009/06/17 18:46:07 UTC

[jira] Created: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
----------------------------------------------------------------------

                 Key: HBASE-1534
                 URL: https://issues.apache.org/jira/browse/HBASE-1534
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: stack
             Fix For: 0.20.0


We got a disconnect from ZooKeeper, but when we then tried to reinitialize ourselves we got an NPE. See below.

{code}
2009-06-17 11:58:55,102 [Thread-16] INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Starting shutdown thread. 
2009-06-17 11:58:55,102 [Thread-16] INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread complete
2009-06-17 11:58:55,102 [main-EventThread] INFO org.apache.hadoop.hbase.ipc.HBaseRpcMetrics: Initializing RPC Metrics with hostName=HRegionServer, port=60021
2009-06-17 11:58:55,103 [main-EventThread] INFO org.apache.hadoop.hbase.regionserver.MemcacheFlusher: globalMemcacheLimit=556.7m, globalMemcacheLimitLowMark=347.9m, maxHeap=1.4g
2009-06-17 11:58:55,103 [main-EventThread] INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Runs every 10000000ms
2009-06-17 11:58:55,148 [regionserver/0:0:0:0:0:0:0:0:60021] ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed init
java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.init(HRegionServer.java:713)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:431)
    at java.lang.Thread.run(Thread.java:619)
2009-06-17 11:58:55,153 [regionserver/0:0:0:0:0:0:0:0:60021] FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: Unhandled exception. Aborting...
java.io.IOException: Region server startup failed
    at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:832)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.init(HRegionServer.java:751)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:431)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.init(HRegionServer.java:713)
    ... 2 more   
{code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-1534.
--------------------------

    Resolution: Fixed

Thanks for the review, Nitay.

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>         Attachments: 1534-redux-v2.patch, 1534-redux.patch, hbase-1534-v2.patch, hbase-1534.patch



[jira] Updated: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1534:
-------------------------

    Attachment: 1534-redux.patch

What do you think of this, Nitay?

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>         Attachments: 1534-redux.patch, hbase-1534-v2.patch, hbase-1534.patch



[jira] Reopened: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reopened HBASE-1534:
--------------------------


Just saw this on jgray's cluster. Looking at reportForDuty, it looks like we might be able to fall through the reportForDuty method and out into init (say, if stopRequested had not been reset?).

Here is jgray's log:

{code}
2009-07-29 08:12:25,630 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread complete
2009-07-29 08:12:25,630 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: globalMemStoreLimit=597.5m, globalMemStoreLimitLowMark=298.8m, maxHeap=1.9g
2009-07-29 08:12:25,631 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Runs every 10000000ms
2009-07-29 08:12:25,637 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, host=zk5:2181,zk4:2181,zk3:2181,zk2:2181,zk6:2181 sessionTimeout=60000 watcher=org.apache.hadoop.hbase.regionserver.HRegionServer@2c96cf11
2009-07-29 08:12:25,638 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server zk3/XX.XX.XX.143:2181
2009-07-29 08:12:25,641 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/XX.XX.XX.217:49697 remote=zk3/XX.XX.XX.143:2181]
2009-07-29 08:12:25,641 INFO org.apache.zookeeper.ClientCnxn: Server connection successful
2009-07-29 08:12:25,650 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2009-07-29 08:12:25,650 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed init
java.lang.NullPointerException
	at org.apache.hadoop.hbase.regionserver.HRegionServer.init(HRegionServer.java:705)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:420)
	at java.lang.Thread.run(Thread.java:636)
2009-07-29 08:12:25,651 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: Unhandled exception. Aborting...
java.io.IOException: Region server startup failed
	at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:831)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.init(HRegionServer.java:747)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:420)
	at java.lang.Thread.run(Thread.java:636)
Caused by: java.lang.NullPointerException
	at org.apache.hadoop.hbase.regionserver.HRegionServer.init(HRegionServer.java:705)
	... 2 more

{code}

The scenario was an expired ZooKeeper session.
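
A minimal sketch of the fall-through concern above, assuming a report-for-duty step that can hand back null when a stop has been requested or the master could not be reached. The class StartupGuard, the method initIfReported, and the Map used as a stand-in for the master's configuration are hypothetical names for illustration; this is not the HRegionServer code or the attached patch.

```java
import java.util.Map;

/** Hypothetical stand-in for the startup path; names are illustrative only. */
final class StartupGuard {

    /**
     * Runs the stand-in init step only when there is something to init with.
     * Returns true if init ran, false if startup was skipped.
     */
    static boolean initIfReported(Map<String, String> masterConfig, boolean stopRequested) {
        if (stopRequested || masterConfig == null) {
            // Nothing usable came back from the (hypothetical) report-for-duty
            // step, so skip init rather than dereference null.
            return false;
        }
        // Stand-in for the real init work: touch the config as init would.
        masterConfig.size();
        return true;
    }
}
```

With a guard like this, a null from the earlier step leads to a skipped init rather than a NullPointerException inside it.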

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>         Attachments: hbase-1534-v2.patch, hbase-1534.patch



[jira] Updated: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nitay Joffe updated HBASE-1534:
-------------------------------

    Attachment: hbase-1534.patch

How about this, Stack?

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>         Attachments: hbase-1534.patch



[jira] Commented: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720833#action_12720833 ] 

Nitay Joffe commented on HBASE-1534:
------------------------------------

This is a problem because init is assuming what it takes in will never be null, yet reportForDuty can return null if it's unable to getMaster().
Seems to me like the initial reportForDuty should be in a loop until it returns something non-null?

What do you think?

I can take this on if you like.
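
A rough sketch of the loop idea, assuming the report-for-duty call can be modelled as a Supplier that returns null on failure. ReportForDutyRetry, retryUntilNonNull, and the pauseMillis parameter are hypothetical names, not taken from HRegionServer or from any patch on this issue.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

/** Hypothetical sketch of "loop until non-null"; not the HRegionServer code. */
final class ReportForDutyRetry {

    /**
     * Calls attempt until it yields a non-null value or stopRequested turns true.
     * Returns null only if a stop was requested or the thread was interrupted,
     * so the caller still has to check the result before using it.
     */
    static <T> T retryUntilNonNull(Supplier<T> attempt,
                                   BooleanSupplier stopRequested,
                                   long pauseMillis) {
        while (!stopRequested.getAsBoolean()) {
            T result = attempt.get();
            if (result != null) {
                return result;
            }
            try {
                // Pause between attempts so the loop does not spin.
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and give up.
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return null;
    }
}
```

The stop check keeps the loop from blocking shutdown, which is why a null return is still possible and must be handled by the caller.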

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.20.0



[jira] Commented: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722025#action_12722025 ] 

stack commented on HBASE-1534:
------------------------------

This is a bad name for a variable: getMasterLogTime.  It looks like the name of a getter method.

You are doing sleeper.sleep.  How long does that sleep for?  Is it < 5000?  If not, just log every time through?  I'd say log every minute, rather than every 5 seconds.

Otherwise, looks good N.
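
One way to make the logging interval independent of how long each sleep lasts, as a sketch only. LogThrottle and shouldLog are hypothetical names, and the one-minute interval in the usage note is just the value suggested above, not something read from the patch.

```java
/** Hypothetical helper: tells the caller when enough time has passed to log again. */
final class LogThrottle {
    private final long intervalMillis;
    private long lastLogMillis;
    private boolean loggedOnce;

    LogThrottle(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    /** Returns true at most once per interval; the first call always returns true. */
    boolean shouldLog(long nowMillis) {
        if (!loggedOnce || nowMillis - lastLogMillis >= intervalMillis) {
            loggedOnce = true;
            lastLogMillis = nowMillis;
            return true;
        }
        return false;
    }
}
```

A retry loop could create new LogThrottle(60000L) once and call shouldLog(System.currentTimeMillis()) on each pass, logging only when it returns true.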

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>         Attachments: hbase-1534.patch



[jira] Assigned: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HBASE-1534:
----------------------------

    Assignee: Nitay Joffe

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0



[jira] Updated: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nitay Joffe updated HBASE-1534:
-------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>         Attachments: hbase-1534-v2.patch, hbase-1534.patch



[jira] Commented: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720761#action_12720761 ] 

stack commented on HBASE-1534:
------------------------------

Nitay or J-D, can ZooKeeper note session expirations in its log?

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.20.0



[jira] Commented: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720932#action_12720932 ] 

stack commented on HBASE-1534:
------------------------------

Loop and pause, yeah. Perhaps log every so often, say once every 5 minutes, so that when viewing the logs it's obvious where it's at.

Thanks for taking it Nitay... let me give it to you.

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.20.0


[jira] Updated: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nitay Joffe updated HBASE-1534:
-------------------------------

    Status: Patch Available  (was: In Progress)

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>         Attachments: hbase-1534.patch


[jira] Commented: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736883#action_12736883 ] 

Nitay Joffe commented on HBASE-1534:
------------------------------------

+1 LG. Didn't try the patch, just looked over the logic.

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>         Attachments: 1534-redux-v2.patch, 1534-redux.patch, hbase-1534-v2.patch, hbase-1534.patch


[jira] Commented: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736877#action_12736877 ] 

Nitay Joffe commented on HBASE-1534:
------------------------------------

Wouldn't the NPE still occur, even with this patch? 

{noformat}
-      init(reportForDuty());
+      MapWritable w = null;
+      while (!stopRequested.get()) {
+        w = reportForDuty();
+        if (w != null) break;
+        sleeper.sleep();
+        LOG.warn("No response from master on reportForDuty. Sleeping and " +
+          "then trying again.");
+      }
+      init(w);
{noformat}

Say stopRequested is true. Then we end up calling init(w) with w = null, which leads to the NPE?
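
One way to close the gap described above is to check the result after the loop and skip init when the loop exited because of a stop request. The sketch below is a stand-alone illustration under stated assumptions, not the contents of any attached patch: ReportForDutySketch, waitForMaster and start are hypothetical names, a plain Map stands in for MapWritable, and init is a stub rather than HRegionServer.init.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

final class ReportForDutySketch {
    /** Polls until the master answers or a stop is requested; null means "stopped". */
    static Map<String, String> waitForMaster(AtomicBoolean stopRequested,
                                             Supplier<Map<String, String>> reportForDuty,
                                             long sleepMillis) {
        while (!stopRequested.get()) {
            Map<String, String> w = reportForDuty.get();
            if (w != null) {
                return w;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and give up
                return null;
            }
            System.err.println("No response from master on reportForDuty. "
                + "Sleeping and then trying again.");
        }
        return null;
    }

    /** Returns true if init ran, false if startup was skipped because of a stop. */
    static boolean start(AtomicBoolean stopRequested,
                         Supplier<Map<String, String>> reportForDuty,
                         long sleepMillis) {
        Map<String, String> w = waitForMaster(stopRequested, reportForDuty, sleepMillis);
        if (w == null) {
            // The guard: never hand a null response to init.
            return false;
        }
        init(w);
        return true;
    }

    /** Stub standing in for the real init; it dereferences its argument. */
    static void init(Map<String, String> w) {
        System.out.println("init with " + w.size() + " entries");
    }

    public static void main(String[] args) {
        // Stop already requested: the loop body never runs and init is skipped.
        System.out.println(start(new AtomicBoolean(true), () -> null, 10L));
        // Master answers immediately: init runs.
        System.out.println(start(new AtomicBoolean(false),
            () -> Collections.singletonMap("key", "value"), 10L));
    }
}
```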

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>         Attachments: 1534-redux.patch, hbase-1534-v2.patch, hbase-1534.patch


[jira] Work started: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HBASE-1534 started by Nitay Joffe.

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>         Attachments: hbase-1534.patch


[jira] Updated: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1534:
-------------------------

    Attachment: 1534-redux-v2.patch

Addresses the NPE issue raised by Nitay on IRC.

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>         Attachments: 1534-redux-v2.patch, 1534-redux.patch, hbase-1534-v2.patch, hbase-1534.patch


[jira] Commented: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722038#action_12722038 ] 

stack commented on HBASE-1534:
------------------------------

+1

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>         Attachments: hbase-1534-v2.patch, hbase-1534.patch


[jira] Updated: (HBASE-1534) Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nitay Joffe updated HBASE-1534:
-------------------------------

    Attachment: hbase-1534-v2.patch

Oops, that should have been 5 million, i.e. 5 mins.

Anyway, good point. The sleeper is initialized to 3 minutes, so I think that's good enough, and we can just log after each finished sleep, as you say.
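
For readers checking the interval values mentioned in this thread, the conversions can be worked out with java.util.concurrent.TimeUnit. This sketch assumes the values are in milliseconds, as the "ms" suffix in the quoted log line ("Runs every 10000000ms") suggests; the class and method names are made up for the example.

```java
import java.util.concurrent.TimeUnit;

final class IntervalUnitsSketch {
    /** Minutes expressed as milliseconds. */
    static long minutesToMillis(long minutes) {
        return TimeUnit.MINUTES.toMillis(minutes);
    }

    /** Milliseconds expressed as whole minutes (truncated). */
    static long millisToWholeMinutes(long millis) {
        return TimeUnit.MILLISECONDS.toMinutes(millis);
    }

    public static void main(String[] args) {
        System.out.println(minutesToMillis(5));                // 300000
        System.out.println(minutesToMillis(3));                // 180000
        System.out.println(millisToWholeMinutes(5_000_000L));  // 83
        System.out.println(millisToWholeMinutes(10_000_000L)); // 166
    }
}
```

Under that assumption, 5 minutes corresponds to 300,000 ms, while 5,000,000 ms is roughly 83 minutes and the logged 10,000,000 ms is roughly 166 minutes.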

> Got ZooKeeper event, state: Disconnected on HRS and then NPE on reinit
> ----------------------------------------------------------------------
>
>                 Key: HBASE-1534
>                 URL: https://issues.apache.org/jira/browse/HBASE-1534
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>         Attachments: hbase-1534-v2.patch, hbase-1534.patch