Posted to issues@hive.apache.org by "Vikram Ahuja (Jira)" <ji...@apache.org> on 2024/01/30 05:46:00 UTC

[jira] [Comment Edited] (HIVE-28042) DigestMD5 error during opening connection to HMS

    [ https://issues.apache.org/jira/browse/HIVE-28042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812151#comment-17812151 ] 

Vikram Ahuja edited comment on HIVE-28042 at 1/30/24 5:45 AM:
--------------------------------------------------------------

*Another instance of this issue:*

 
{code:java}
2024-01-24T02:11:21,324 ERROR [TThreadPoolServer WorkerProcess-760394]: transport.TSaslTransport (TSaslTransport.java:open) - SASL negotiation failure
javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password
        at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java)
        at com.sun.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java)
        at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java)
        at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java)
        at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java)
        at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java)
        at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java)
        at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java)
        at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java)
        at java.lang.Thread.run(Thread.java)
Caused by: org.apache.hadoop.security.token.SecretManager$InvalidToken: token expired or does not exist: HIVE_DELEGATION_TOKEN owner=***, renewer=***, realUser=*****************, issueDate=1705973286139, maxDate=1706578086139, sequenceNumber=3294063, masterKeyId=7601
        at org.apache.hadoop.hive.metastore.security.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java)
        at org.apache.hadoop.hive.metastore.security.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java)
        at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.getPassword(HadoopThriftAuthBridge.java)
        at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.handle(HadoopThriftAuthBridge.java)
        at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java)
        ... 15 more
{code}
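
As a side check, the issueDate and maxDate values in the InvalidToken message can be decoded to see where this failure falls within the token's lifetime. The sketch below is only illustrative: the class and method names are made up for this note and are not part of Hive, and the log timestamp is assumed to be UTC because the log line carries no zone information.

```java
import java.time.Duration;
import java.time.Instant;

class TokenTimelineSketch {
    // Copied from the InvalidToken message in the trace (epoch milliseconds).
    static final long ISSUE_DATE_MS = 1705973286139L;
    static final long MAX_DATE_MS = 1706578086139L;

    // Assumption: the log timestamp 2024-01-24T02:11:21,324 is in UTC.
    static final Instant FAILURE = Instant.parse("2024-01-24T02:11:21.324Z");

    /** Maximum lifetime of the token: maxDate minus issueDate. */
    static Duration maxLifetime() {
        return Duration.ofMillis(MAX_DATE_MS - ISSUE_DATE_MS);
    }

    /** Time left until maxDate at the moment the failure was logged. */
    static Duration remainingAtFailure() {
        return Duration.between(FAILURE, Instant.ofEpochMilli(MAX_DATE_MS));
    }

    public static void main(String[] args) {
        System.out.println("issueDate = " + Instant.ofEpochMilli(ISSUE_DATE_MS));
        System.out.println("maxDate   = " + Instant.ofEpochMilli(MAX_DATE_MS));
        System.out.println("max lifetime (days)     = " + maxLifetime().toDays());
        System.out.println("left at failure (hours) = " + remainingAtFailure().toHours());
    }
}
```

maxDate is exactly 7 days after issueDate, and the failure was logged several days before maxDate. A zone offset of a few hours on the log timestamp does not change that, so the token was rejected well before its maximum lifetime ended.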
 

 

*Analysis of the issue:*

This issue occurs only when HS2 tries to open a new DIGEST-MD5 based Thrift TSaslClientTransport for a session that has been open for a long time.

HS2 reuses the same metaStoreClient object, which is embedded in Hive.java, across all connections, but in some cases we have observed it creating a new metaStoreClient with a fresh connection (TSaslClientTransport). I found two cases that lead to this:
 # MSCK repair
 # RetryingMetaStoreClient, when there is any HMS issue (applicable to any SQL query that interacts with the HMS)

 

*Root cause of this issue:*

A background thread called ExpiredTokenRemover runs in HMS (class: TokenStoreDelegationTokenSecretManager.java). This thread removes a token from the tokenStore once its renewal time has passed, as well as once its expiry time has passed, but it should remove the token only after the expiry time, since the token can still be renewed until then.
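
To make the difference concrete, here is a minimal sketch of the two removal rules. It is a simplified model and not the actual Hive code: StoredToken and both method names are hypothetical, and the real remover works against the tokenStore rather than plain fields.

```java
class ExpiredTokenSweepSketch {
    /** Hypothetical stand-in for a stored delegation token; not a Hive class. */
    static final class StoredToken {
        final long renewDate; // current renewal deadline, epoch millis
        final long maxDate;   // hard expiry, epoch millis

        StoredToken(long renewDate, long maxDate) {
            this.renewDate = renewDate;
            this.maxDate = maxDate;
        }
    }

    /** Behaviour described above: the token is dropped once either deadline has passed. */
    static boolean removedByDescribedSweep(StoredToken token, long now) {
        return now > token.maxDate || now > token.renewDate;
    }

    /** Behaviour the fix is aiming for: the token is dropped only once maxDate has passed. */
    static boolean removedByProposedSweep(StoredToken token, long now) {
        return now > token.maxDate;
    }

    public static void main(String[] args) {
        long day = 24L * 60 * 60 * 1000;
        // Illustrative token: renewal deadline after 1 day, hard expiry after 7 days.
        StoredToken token = new StoredToken(day, 7 * day);
        long now = 2 * day; // past the renewal deadline, before hard expiry
        System.out.println("described sweep removes: " + removedByDescribedSweep(token, now));
        System.out.println("proposed sweep removes:  " + removedByProposedSweep(token, now));
    }
}
```

With the illustrative values in main, a token that is two days old is removed under the described rule but kept under the proposed one; both rules remove it once maxDate has passed.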

 

I will raise a fix for this by changing the code that deletes the token as soon as its renewal time has passed.



> DigestMD5 error during opening connection to HMS
> ------------------------------------------------
>
>                 Key: HIVE-28042
>                 URL: https://issues.apache.org/jira/browse/HIVE-28042
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Vikram Ahuja
>            Assignee: Vikram Ahuja
>            Priority: Major
>
> Hello,
> In our deployment we see the following exception in the HMS logs when an HMS connection is opened from HS2 for a session that has been open for a long time, which leads to query failures:
> {code:java}
> 2024-01-24T02:11:21,324 ERROR [TThreadPoolServer WorkerProcess-760394]: transport.TSaslTransport (TSaslTransport.java:open) - SASL negotiation failure
> javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password
>         at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java)
>         at com.sun.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java)
>         at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java)
>         at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java)
>         at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java)
>         at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java)
>         at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java)
>         at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java)
>         at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java)
>         at java.lang.Thread.run(Thread.java)
> Caused by: org.apache.hadoop.security.token.SecretManager$InvalidToken: token expired or does not exist: HIVE_DELEGATION_TOKEN owner=***, renewer=***, realUser=*****************, issueDate=1705973286139, maxDate=1706578086139, sequenceNumber=3294063, masterKeyId=7601
>         at org.apache.hadoop.hive.metastore.security.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java)
>         at org.apache.hadoop.hive.metastore.security.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java)
>         at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.getPassword(HadoopThriftAuthBridge.java)
>         at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.handle(HadoopThriftAuthBridge.java)
>         at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java)
>         ... 15 more
> {code}
>  


