You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Szehon Ho (Jira)" <ji...@apache.org> on 2021/09/15 01:18:00 UTC

[jira] [Commented] (HIVE-25522) NullPointerException in TxnHandler

    [ https://issues.apache.org/jira/browse/HIVE-25522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17415278#comment-17415278 ] 

Szehon Ho commented on HIVE-25522:
----------------------------------

[~csun] helped me to reason through the code, thanks!

It looks like the TxnHandler's initialization block sets a lot of static variables llke connPool, sqlGenerator, etc..  But it skips it if only connPool is set.  So it's possible that connPool was successfully set but the thread crashed and did not set the other variables.  Ref:  https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L372

It looks like TxnHandler's are thread local and are constructed on demand, so a fatal error in one thread does not kill the HMS server.

> NullPointerException in TxnHandler
> ----------------------------------
>
>                 Key: HIVE-25522
>                 URL: https://issues.apache.org/jira/browse/HIVE-25522
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>    Affects Versions: 3.1.2
>            Reporter: Szehon Ho
>            Priority: Major
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore.  Iceberg issues a lot of lock() calls for commits.
> We hit randomly a strange NPE that fails Iceberg commits.
> {noformat}
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] metastore.RetryingHMSHandler: java.lang.NullPointerException
> 	at org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
> 	at org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
> 	at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
> 	at com.sun.proxy.$Proxy27.lock(Unknown Source)
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111)
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095)
> 	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> 	at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
> 	at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
> 	at java.base/java.security.AccessController.doPrivileged(Native Method)
> 	at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> 	at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
> 	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> 	at java.base/java.lang.Thread.run(Thread.java:834)
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] server.TThreadPoolServer: Error occurred during processing of message.
> java.lang.NullPointerException: null
> 	at org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903) ~[hive-exec-3.1.2.jar:3.1.2]
> 	at org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827) ~[hive-exec-3.1.2.jar:3.1.2]
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217) ~[hive-exec-3.1.2.jar:3.1.2]
> 	at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source) ~[?:?]
> 	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
> 	at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) ~[hive-exec-3.1.2.jar:3.1.2]
> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) ~[hive-exec-3.1.2.jar:3.1.2]
> 	at com.sun.proxy.$Proxy27.lock(Unknown Source) ~[?:?]
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111) ~[hive-exec-3.1.2.jar:3.1.2]
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095) ~[hive-exec-3.1.2.jar:3.1.2]
> 	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-3.1.2.jar:3.1.2]
> 	at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111) ~[hive-exec-3.1.2.jar:3.1.2]
> 	at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) ~[hive-exec-3.1.2.jar:3.1.2]
> 	at java.security.AccessController.doPrivileged(Native Method) ~[?:?]
> 	at javax.security.auth.Subject.doAs(Subject.java:423) ~[?:?]
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) ~[hadoop-common-3.1.4.jar:?]
> 	at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119) ~[hive-exec-3.1.2.jar:3.1.2]
> 	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.2.jar:3.1.2]
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
> 	at java.lang.Thread.run(Thread.java:834) [?:?]
> {noformat}
> It seems it's this line, though root cause is not deterined.
> https://github.com/apache/hive/blob/rel/release-3.1.2/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L1903



--
This message was sent by Atlassian Jira
(v8.3.4#803005)