Posted to common-issues@hadoop.apache.org by "Daryn Sharp (JIRA)" <ji...@apache.org> on 2018/06/04 15:06:00 UTC

[jira] [Commented] (HADOOP-15487) ConcurrentModificationException resulting in Kerberos authentication error.

    [ https://issues.apache.org/jira/browse/HADOOP-15487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500348#comment-16500348 ] 

Daryn Sharp commented on HADOOP-15487:
--------------------------------------

The second exception is an unrelated JDK bug fixed in 8u161: [JDK-8170278: ticket renewal won't happen with debugging turned on|https://bugs.openjdk.java.net/browse/JDK-8170278].  The gssapi is smart enough to recognize and handle expired tickets from a keytab.  The problem is that {{KerberosTicket#toString}} throws the ISE if the ticket is expired, and debug logging calls it.  The easy workaround is to not enable debug logging.

The original issue is distinct.  If there truly are no custom plugins, it may be related to curator/zookeeper/AuthenticatedURL.  Which specific Apache release is this?  Did the server recover?

We may need to consider using a distinct subject/ugi for rpc servers to prevent other code from munging our JAAS credentials, but there are a few possible grues lurking there.
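
For anyone following along, here is a minimal sketch of the fail-fast behaviour behind the {{LinkedList$ListItr.checkForComodification}} frame in the trace. It is not Hadoop or JDK code: the class name, method names, and credential strings are hypothetical, and a plain {{LinkedList}} stands in for the Subject's credential list. The second thread is simulated by mutating the list between two {{next()}} calls so the result is deterministic.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical illustration only; names are not from Hadoop or the JDK.
public class ComodificationSketch {

    // Returns true if mutating the list mid-iteration trips the fail-fast check.
    static boolean mutateWhileIterating() {
        List<String> creds = new LinkedList<>();
        creds.add("keytab");
        creds.add("ticket-1");

        Iterator<String> it = creds.iterator();
        it.next();               // reader starts walking the credentials
        creds.add("ticket-2");   // stands in for other code adding a credential
        try {
            it.next();           // modCount no longer matches the iterator's copy
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterating over a copy is unaffected by later changes to the original list.
    // With real threads, taking the copy would itself need synchronization.
    static int iterateOverSnapshot() {
        List<String> creds = new LinkedList<>();
        creds.add("keytab");
        creds.add("ticket-1");

        List<String> snapshot = new ArrayList<>(creds);
        int seen = 0;
        for (String c : snapshot) {
            creds.add("renewed-" + c); // mutation no longer affects the loop
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("CME on live iteration: " + mutateWhileIterating());
        System.out.println("Entries seen via snapshot: " + iterateOverSnapshot());
    }
}
```

The point is only that any code touching the shared credential list while a reader thread is iterating over it can produce this exception, which is why a separate subject for the rpc server is worth considering.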



> ConcurrentModificationException resulting in Kerberos authentication error.
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-15487
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15487
>             Project: Hadoop Common
>          Issue Type: Bug
>         Environment: CDH 5.13.3. Kerberized, Hadoop-HA, jdk1.8.0_152
>            Reporter: Wei-Chiu Chuang
>            Priority: Major
>
> We found the following exception message in a NameNode log. It seems the ConcurrentModificationException caused a Kerberos authentication error.
> It appears to be a JDK bug, similar to HADOOP-13433 (Race in UGI.reloginFromKeytab), but this version of Hadoop (CDH 5.13.3) already includes the patch for HADOOP-13433. (The stack trace also differs.) This cluster runs on JDK 1.8.0_152.
> {noformat}
> 2018-05-19 04:00:00,182 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs/node1@EXAMPLE.COM (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> 2018-05-19 04:00:00,183 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8020: readAndProcess from client 10.16.20.122 threw exception [java.util.ConcurrentModificationException]
> java.util.ConcurrentModificationException
>         at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
>         at java.util.LinkedList$ListItr.next(LinkedList.java:888)
>         at javax.security.auth.Subject$SecureSet$1.next(Subject.java:1070)
>         at javax.security.auth.Subject$ClassSet$1.run(Subject.java:1401)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject$ClassSet.populateSet(Subject.java:1399)
>         at javax.security.auth.Subject$ClassSet.<init>(Subject.java:1372)
>         at javax.security.auth.Subject.getPrivateCredentials(Subject.java:767)
>         at sun.security.jgss.krb5.SubjectComber.findAux(SubjectComber.java:127)
>         at sun.security.jgss.krb5.SubjectComber.findMany(SubjectComber.java:69)
>         at sun.security.jgss.krb5.ServiceCreds.getInstance(ServiceCreds.java:96)
>         at sun.security.jgss.krb5.Krb5Util.getServiceCreds(Krb5Util.java:203)
>         at sun.security.jgss.krb5.Krb5AcceptCredential$1.run(Krb5AcceptCredential.java:74)
>         at sun.security.jgss.krb5.Krb5AcceptCredential$1.run(Krb5AcceptCredential.java:72)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at sun.security.jgss.krb5.Krb5AcceptCredential.getInstance(Krb5AcceptCredential.java:71)
>         at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:127)
>         at sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:193)
>         at sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:427)
>         at sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:62)
>         at sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:154)
>         at com.sun.security.sasl.gsskerb.GssKrb5Server.<init>(GssKrb5Server.java:108)
>         at com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
>         at org.apache.hadoop.security.SaslRpcServer$FastSaslServerFactory.createSaslServer(SaslRpcServer.java:398)
>         at org.apache.hadoop.security.SaslRpcServer$1.run(SaslRpcServer.java:164)
>         at org.apache.hadoop.security.SaslRpcServer$1.run(SaslRpcServer.java:161)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
>         at org.apache.hadoop.security.SaslRpcServer.create(SaslRpcServer.java:160)
>         at org.apache.hadoop.ipc.Server$Connection.createSaslServer(Server.java:1742)
>         at org.apache.hadoop.ipc.Server$Connection.processSaslMessage(Server.java:1522)
>         at org.apache.hadoop.ipc.Server$Connection.saslProcess(Server.java:1433)
>         at org.apache.hadoop.ipc.Server$Connection.saslReadAndProcess(Server.java:1396)
>         at org.apache.hadoop.ipc.Server$Connection.processRpcOutOfBandRequest(Server.java:2080)
>         at org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1920)
>         at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1682)
>         at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:896)
>         at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:752)
>         at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:723)
> {noformat}
> We saw a few GSSExceptions in the NN log, but only one threw the ConcurrentModificationException. This NN had a failover, which was caused by ZKFC hitting a GSSException too. We suspect the two issues are related.


