Posted to dev@hbase.apache.org by "Josh Elser (JIRA)" <ji...@apache.org> on 2018/02/20 18:51:00 UTC

[jira] [Created] (HBASE-20028) NPE when comparing versions in AM after RS ZK expiration

Josh Elser created HBASE-20028:
----------------------------------

             Summary: NPE when comparing versions in AM after RS ZK expiration
                 Key: HBASE-20028
                 URL: https://issues.apache.org/jira/browse/HBASE-20028
             Project: HBase
          Issue Type: Bug
          Components: master
            Reporter: Josh Elser
            Assignee: Josh Elser
             Fix For: 2.0.0-beta-2


{noformat}
2018-02-20 16:36:41,794 ERROR [Thread-85] assignment.AssignmentManager: java.lang.NullPointerException
java.lang.NullPointerException
	at org.apache.hadoop.hbase.util.VersionInfo.compareVersion(VersionInfo.java:122)
	at org.apache.hadoop.hbase.master.assignment.AssignmentManager.lambda$getExcludedServersForSystemTable$5(AssignmentManager.java:1860)
	at java.util.Collections.max(Collections.java:712)
	at org.apache.hadoop.hbase.master.assignment.AssignmentManager.getExcludedServersForSystemTable(AssignmentManager.java:1859)
	at org.apache.hadoop.hbase.master.assignment.AssignmentManager.lambda$checkIfShouldMoveSystemRegionAsync$0(AssignmentManager.java:464){noformat}
This looks like a race condition around an RS losing its ZK lock. If the AM checks whether it should move a region to a server whose lock we have already seen expire, but which has not yet been processed as "dead", {{HMaster.getRegionServerVersion()}} can return null for that server, and the version comparison then fails with the NPE above.

Looks like a simple filter on the servers to preclude null versions would fix the problem.
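
A minimal, self-contained sketch of that filter idea follows. It is not the actual patch and does not use HBase classes: {{NullSafeVersionFilter}}, {{excludedServers}} and the simplified {{compareVersion}} (numeric dotted versions only) are hypothetical stand-ins, and the assumption that the method returns the servers running a lower version than the highest one seen is a loose model of {{getExcludedServersForSystemTable}}. Whether servers with an unknown version should themselves be excluded is left open here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class NullSafeVersionFilter {

    // Hypothetical stand-in for VersionInfo.compareVersion. Handles only
    // numeric dotted versions and, like the real call in the stack trace,
    // throws a NullPointerException if either argument is null.
    static int compareVersion(String v1, String v2) {
        String[] a = v1.split("\\.");
        String[] b = v2.split("\\.");
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = i < a.length ? Integer.parseInt(a[i]) : 0;
            int y = i < b.length ? Integer.parseInt(b[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // Returns the names of servers whose version is lower than the highest
    // known version. Servers whose version is null (for example, an RS whose
    // ZK session expired but which is not yet processed as dead) are dropped
    // before any comparison happens.
    static List<String> excludedServers(Map<String, String> versionsByServer) {
        List<Map.Entry<String, String>> known = versionsByServer.entrySet().stream()
            .filter(e -> e.getValue() != null) // the null filter proposed above
            .collect(Collectors.toList());
        if (known.isEmpty()) {
            // Collections.max would throw on an empty collection.
            return Collections.emptyList();
        }
        String highest = Collections.max(known,
            (e1, e2) -> compareVersion(e1.getValue(), e2.getValue())).getValue();
        return known.stream()
            .filter(e -> compareVersion(e.getValue(), highest) < 0)
            .map(Map.Entry::getKey)
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> versions = new HashMap<>();
        versions.put("rs1", "2.0.0");
        versions.put("rs2", "1.4.0");
        versions.put("rs3", null); // version unknown after ZK expiration
        System.out.println(excludedServers(versions));
    }
}
```

Without the {{filter}} step, the comparator passed to {{Collections.max}} would receive the null version for {{rs3}} and fail in the same way as the trace above.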



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)