Posted to issues@hbase.apache.org by "Sylvain Veyrié (JIRA)" <ji...@apache.org> on 2019/07/02 12:35:00 UTC

[jira] [Updated] (HBASE-22650) NPE in AssignmentManager (master crash on startup)

     [ https://issues.apache.org/jira/browse/HBASE-22650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Veyrié updated HBASE-22650:
-----------------------------------
    Description: 
On HMaster Startup:

 
{quote}2019-07-02 12:38:11,312 FATAL [orc3:16000.activeMasterManager] master.HMaster: Failed to become active master
 java.lang.NullPointerException
     at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
     at java.util.concurrent.ConcurrentHashMap.containsKey(ConcurrentHashMap.java:964)
     at java.util.concurrent.ConcurrentHashMap$KeySetView.contains(ConcurrentHashMap.java:4558)
     at java.util.Collections$UnmodifiableCollection.contains(Collections.java:1032)
     at org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:3094)
     at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:495)
     at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:830)
     at org.apache.hadoop.hbase.master.HMaster.access$600(HMaster.java:202)
     at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:1883)
     at java.lang.Thread.run(Thread.java:748)
 2019-07-02 12:38:11,312 FATAL [orc3:16000.activeMasterManager] master.HMaster: Master server abort: loaded coprocessors are: []
 2019-07-02 12:38:11,312 FATAL [orc3:16000.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
 java.lang.NullPointerException
     at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
     at java.util.concurrent.ConcurrentHashMap.containsKey(ConcurrentHashMap.java:964)
     at java.util.concurrent.ConcurrentHashMap$KeySetView.contains(ConcurrentHashMap.java:4558)
     at java.util.Collections$UnmodifiableCollection.contains(Collections.java:1032)
     at org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:3094)
     at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:495)
     at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:830)
     at org.apache.hadoop.hbase.master.HMaster.access$600(HMaster.java:202)
     at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:1883)
     at java.lang.Thread.run(Thread.java:748)
{quote}
The NPE occurs when regionLocation is null, which can happen just above, on line 3086 (or when getRegionServer returns null).

We hit this on 1.2.12 and fixed it with the corresponding patch, but since that version is no longer supported, we did not submit it.

Attached is the patch for 1.3.5. We did not test it on 1.4+.
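
To illustrate the failure mode in the stack trace above: calling contains(null) on an unmodifiable view of a ConcurrentHashMap key set throws a NullPointerException, because ConcurrentHashMap does not accept null keys. The following is only a minimal sketch, not the attached patch and not the actual AssignmentManager code. The class name, the method names, and the String server names are hypothetical stand-ins for the online-server collection and regionLocation.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.concurrent.ConcurrentHashMap;

public class NullSafeContains {

    // Unguarded lookup: throws NullPointerException when location is null
    // and the collection is backed by a ConcurrentHashMap key set.
    static boolean containsUnsafe(Collection<String> servers, String location) {
        return servers.contains(location);
    }

    // Guarded lookup (assumed shape of a fix): a null location is treated
    // as "not online" instead of being passed to contains().
    static boolean containsNullSafe(Collection<String> servers, String location) {
        return location != null && servers.contains(location);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the set of online region servers.
        ConcurrentHashMap<String, Boolean> online = new ConcurrentHashMap<>();
        online.put("rs1.example.org,16020,1562063891312", Boolean.TRUE);
        Collection<String> view = Collections.unmodifiableCollection(online.keySet());

        try {
            containsUnsafe(view, null);
            System.out.println("no exception");
        } catch (NullPointerException e) {
            System.out.println("NPE from contains(null)");
        }

        System.out.println(containsNullSafe(view, null));
    }
}
```

Whether a null location should be skipped, logged, or treated as an offline server is a design decision for the real code; the sketch only shows that the null check has to happen before the lookup.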

  was:
On HMaster Startup:

 
{quote}2019-07-02 12:38:11,312 FATAL [orc3:16000.activeMasterManager] master.HMaster: Failed to become active master
java.lang.NullPointerException
    at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
    at java.util.concurrent.ConcurrentHashMap.containsKey(ConcurrentHashMap.java:964)
    at java.util.concurrent.ConcurrentHashMap$KeySetView.contains(ConcurrentHashMap.java:4558)
    at java.util.Collections$UnmodifiableCollection.contains(Collections.java:1032)
    at org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:3094)
    at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:495)
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:830)
    at org.apache.hadoop.hbase.master.HMaster.access$600(HMaster.java:202)
    at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:1883)
    at java.lang.Thread.run(Thread.java:748)
2019-07-02 12:38:11,312 FATAL [orc3:16000.activeMasterManager] master.HMaster: Master server abort: loaded coprocessors are: []
2019-07-02 12:38:11,312 FATAL [orc3:16000.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
java.lang.NullPointerException
    at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
    at java.util.concurrent.ConcurrentHashMap.containsKey(ConcurrentHashMap.java:964)
    at java.util.concurrent.ConcurrentHashMap$KeySetView.contains(ConcurrentHashMap.java:4558)
    at java.util.Collections$UnmodifiableCollection.contains(Collections.java:1032)
    at org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:3094)
    at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:495)
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:830)
    at org.apache.hadoop.hbase.master.HMaster.access$600(HMaster.java:202)
    at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:1883)
    at java.lang.Thread.run(Thread.java:748)
{quote}
It happens when regionLocation is null, which may happen just above on line 3086 (or as returned by getRegionServer

We had this on 1.2.12 with the corresponding patch, but since it is not supported anymore, did not submit it.

Attached is the patch for 1.3.5.


> NPE in AssignmentManager (master crash on startup)
> --------------------------------------------------
>
>                 Key: HBASE-22650
>                 URL: https://issues.apache.org/jira/browse/HBASE-22650
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 1.2.12, 1.3.5
>            Reporter: Sylvain Veyrié
>            Priority: Critical
>              Labels: patch
>         Attachments: AssignmentManager-NPE.patch
>
>
> On HMaster Startup:
>  
> {quote}2019-07-02 12:38:11,312 FATAL [orc3:16000.activeMasterManager] master.HMaster: Failed to become active master
>  java.lang.NullPointerException
>      at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
>      at java.util.concurrent.ConcurrentHashMap.containsKey(ConcurrentHashMap.java:964)
>      at java.util.concurrent.ConcurrentHashMap$KeySetView.contains(ConcurrentHashMap.java:4558)
>      at java.util.Collections$UnmodifiableCollection.contains(Collections.java:1032)
>      at org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:3094)
>      at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:495)
>      at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:830)
>      at org.apache.hadoop.hbase.master.HMaster.access$600(HMaster.java:202)
>      at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:1883)
>      at java.lang.Thread.run(Thread.java:748)
>  2019-07-02 12:38:11,312 FATAL [orc3:16000.activeMasterManager] master.HMaster: Master server abort: loaded coprocessors are: []
>  2019-07-02 12:38:11,312 FATAL [orc3:16000.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
>  java.lang.NullPointerException
>      at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
>      at java.util.concurrent.ConcurrentHashMap.containsKey(ConcurrentHashMap.java:964)
>      at java.util.concurrent.ConcurrentHashMap$KeySetView.contains(ConcurrentHashMap.java:4558)
>      at java.util.Collections$UnmodifiableCollection.contains(Collections.java:1032)
>      at org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:3094)
>      at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:495)
>      at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:830)
>      at org.apache.hadoop.hbase.master.HMaster.access$600(HMaster.java:202)
>      at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:1883)
>      at java.lang.Thread.run(Thread.java:748)
> {quote}
> The NPE occurs when regionLocation is null, which can happen just above, on line 3086 (or when getRegionServer returns null).
> We hit this on 1.2.12 and fixed it with the corresponding patch, but since that version is no longer supported, we did not submit it.
> Attached is the patch for 1.3.5. We did not test it on 1.4+.


