You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-dev@hadoop.apache.org by "Eric Yang (Jira)" <ji...@apache.org> on 2019/11/06 19:12:00 UTC
[jira] [Created] (YARN-9956) Improve connection error message for
YARN ApiServerClient
Eric Yang created YARN-9956:
-------------------------------
Summary: Improve connection error message for YARN ApiServerClient
Key: YARN-9956
URL: https://issues.apache.org/jira/browse/YARN-9956
Project: Hadoop YARN
Issue Type: Bug
Reporter: Eric Yang
In HA environment, yarn.resourcemanager.webapp.address configuration is optional. ApiServiceClient may produce confusing error message like this:
{code}
19/10/30 20:13:42 INFO client.ApiServiceClient: Fail to connect to: host1.example.com:8090
19/10/30 20:13:42 INFO client.ApiServiceClient: Fail to connect to: host2.example.com:8090
19/10/30 20:13:42 INFO util.log: Logging initialized @2301ms
19/10/30 20:13:42 ERROR client.ApiServiceClient: Error: {}
GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)
at java.security.jgss/sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:771)
at java.security.jgss/sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:266)
at java.security.jgss/sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:196)
at org.apache.hadoop.yarn.service.client.ApiServiceClient$1.run(ApiServiceClient.java:125)
at org.apache.hadoop.yarn.service.client.ApiServiceClient$1.run(ApiServiceClient.java:105)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
at org.apache.hadoop.yarn.service.client.ApiServiceClient.generateToken(ApiServiceClient.java:105)
at org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:290)
at org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:271)
at org.apache.hadoop.yarn.service.client.ApiServiceClient.actionLaunch(ApiServiceClient.java:416)
at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:589)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:125)
Caused by: KrbException: Server not found in Kerberos database (7) - LOOKING_UP_SERVER
at java.security.jgss/sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:73)
at java.security.jgss/sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:251)
at java.security.jgss/sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:262)
at java.security.jgss/sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:308)
at java.security.jgss/sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:126)
at java.security.jgss/sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:458)
at java.security.jgss/sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:695)
... 15 more
Caused by: KrbException: Identifier doesn't match expected value (906)
at java.security.jgss/sun.security.krb5.internal.KDCRep.init(KDCRep.java:140)
at java.security.jgss/sun.security.krb5.internal.TGSRep.init(TGSRep.java:65)
at java.security.jgss/sun.security.krb5.internal.TGSRep.<init>(TGSRep.java:60)
at java.security.jgss/sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:55)
... 21 more
19/10/30 20:13:42 ERROR client.ApiServiceClient: Fail to launch application:
java.io.IOException: java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:293)
at org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:271)
at org.apache.hadoop.yarn.service.client.ApiServiceClient.actionLaunch(ApiServiceClient.java:416)
at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:589)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:125)
Caused by: java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1894)
at org.apache.hadoop.yarn.service.client.ApiServiceClient.generateToken(ApiServiceClient.java:105)
at org.apache.hadoop.yarn.service.client.ApiServiceClient.getApiClient(ApiServiceClient.java:290)
... 6 more
Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)
at org.apache.hadoop.yarn.service.client.ApiServiceClient$1.run(ApiServiceClient.java:135)
at org.apache.hadoop.yarn.service.client.ApiServiceClient$1.run(ApiServiceClient.java:105)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
... 8 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - LOOKING_UP_SERVER)
at java.security.jgss/sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:771)
at java.security.jgss/sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:266)
at java.security.jgss/sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:196)
at org.apache.hadoop.yarn.service.client.ApiServiceClient$1.run(ApiServiceClient.java:125)
... 12 more
Caused by: KrbException: Server not found in Kerberos database (7) - LOOKING_UP_SERVER
at java.security.jgss/sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:73)
at java.security.jgss/sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:251)
at java.security.jgss/sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:262)
at java.security.jgss/sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:308)
at java.security.jgss/sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:126)
at java.security.jgss/sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:458)
at java.security.jgss/sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:695)
... 15 more
Caused by: KrbException: Identifier doesn't match expected value (906)
at java.security.jgss/sun.security.krb5.internal.KDCRep.init(KDCRep.java:140)
at java.security.jgss/sun.security.krb5.internal.TGSRep.init(TGSRep.java:65)
at java.security.jgss/sun.security.krb5.internal.TGSRep.<init>(TGSRep.java:60)
at java.security.jgss/sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:55)
... 21 more
{code}
When getRMWebAddress fail to connect to either resource manager hosts, it will fall back to use the yarn-default.xml value 0.0.0.0, and attempt to acquire TGS for HTTP/0.0.0.0, which produces the error shown here. It would be better to avoid trying to use yarn.resourcemanager.webapp.address as fallback for RM host lookup in HA enabled cluster.
In this particular cluster, contacting to host1.example.com and host2.example.com failed due to the same reason that self signed server certificate does not have a valid self-signed CA certificate to verify. This caused the failure in the first place. It would be nice if the error message is more verbose to identify the first error than producing error on the fallback logic which makes no sense to user.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-dev-help@hadoop.apache.org