Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (Updated) (JIRA)" <ji...@apache.org> on 2011/10/01 00:39:45 UTC
[jira] [Updated] (HBASE-4473) NPE when executors are down but events are still coming in

     [ https://issues.apache.org/jira/browse/HBASE-4473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-4473:
--------------------------------------

    Attachment: HBASE-4473.patch
                HBASE-4473-0.90.patch

Attaching patches for all branches. The only difference is that in 0.90 we can't check the size of the executors map.

The NPE is now avoided by simply dropping the event and doing additional logging. Added a test.
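
For readers without the patch at hand, here is a rough sketch of the "drop the event and log" idea. It is not the code from HBASE-4473.patch: the class ExecutorRegistrySketch, its method names, and the log wording are all hypothetical stand-ins, and it uses plain java.util.concurrent executors rather than HBase's own ExecutorService. The empty-map check is my guess at how the map size could be used to tell "shutting down" apart from "unknown executor".

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.logging.Logger;

// Hypothetical registry of named executors; NOT HBase's ExecutorService class.
class ExecutorRegistrySketch {
    private static final Logger LOG =
        Logger.getLogger(ExecutorRegistrySketch.class.getName());

    private final Map<String, ExecutorService> executors = new ConcurrentHashMap<>();

    void start(String name, int threads) {
        executors.put(name, Executors.newFixedThreadPool(threads));
    }

    void shutdown() {
        for (ExecutorService e : executors.values()) {
            e.shutdownNow();
        }
        executors.clear();
    }

    /** Returns true if the event was queued, false if it was dropped. */
    boolean submit(String name, Runnable event) {
        ExecutorService executor = executors.get(name);
        if (executor == null) {
            // Instead of dereferencing null (the NPE in the report), drop the event.
            if (executors.isEmpty()) {
                // Assumption: an empty map means the executors were already stopped.
                LOG.warning("No executors running, probably shutting down; dropping event for " + name);
            } else {
                LOG.warning("Executor " + name + " not found; dropping event");
            }
            return false;
        }
        try {
            executor.execute(event);
            return true;
        } catch (RejectedExecutionException e) {
            // The executor was shut down between the lookup and the submit.
            LOG.warning("Executor " + name + " rejected the event; dropping it");
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorRegistrySketch registry = new ExecutorRegistrySketch();
        registry.start("MASTER_OPEN_REGION", 1);
        registry.submit("MASTER_OPEN_REGION", () -> System.out.println("event handled"));
        registry.shutdown();
        // A late event, such as a ZooKeeper callback, is now dropped rather than throwing.
        registry.submit("MASTER_OPEN_REGION", () -> System.out.println("never runs"));
    }
}
```

Returning a boolean here is only a convenience for the sketch so that a caller or test can see whether the event was dropped.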
                
> NPE when executors are down but events are still coming in
> ----------------------------------------------------------
>
>                 Key: HBASE-4473
>                 URL: https://issues.apache.org/jira/browse/HBASE-4473
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.4
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Minor
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4473-0.90.patch, HBASE-4473.patch
>
>
> Minor annoyance when shutting down a cluster while the master is still receiving events from ZooKeeper:
> {quote}
> 2011-09-22 23:53:01,552 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x3292d87deb004f Received InterruptedException, doing nothing here
> java.lang.InterruptedException
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:485)
>         at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1317)
>         at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:726)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:938)
>         at org.apache.hadoop.hbase.zookeeper.ZKAssign.deleteNode(ZKAssign.java:407)
>         at org.apache.hadoop.hbase.zookeeper.ZKAssign.deleteOpenedNode(ZKAssign.java:284)
>         at org.apache.hadoop.hbase.master.handler.OpenedRegionHandler.process(OpenedRegionHandler.java:88)
>         at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:156)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> ...
> 2011-09-22 23:53:01,558 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Executor service [MASTER_OPEN_REGION-sv2borg170:60000] not found in {}
> 2011-09-22 23:53:01,558 ERROR org.apache.zookeeper.ClientCnxn: Error while calling watcher
> java.lang.NullPointerException
>         at org.apache.hadoop.hbase.executor.ExecutorService.submit(ExecutorService.java:220)
>         at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:447)
>         at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:546)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:281)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
> {quote}
> It's annoying because it then spams you with a bunch of NPEs that have nothing to do with the reason the Master is shutting down. While Googling, I saw that someone else hit the same issue in June: http://pastebin.com/5Tqrj0nq
